FLaNK Stack Weekly for 14 August 2023

Timothy Spann

14-August-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

A lot is going on as we head into the fast rush toward fall, when Flink, Kafka, Apache, and other conferences take place throughout North America.

Get your new Apache NiFi for Dummies!

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

CODE + COMMUNITY

Please join my meetup groups: NJ, NYC, Philly, and virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

This is Issue #98

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

https://www.cloudera.com/solutions/dim-developer.html

Releases

EFM 1.6.0 https://docs.cloudera.com/cem/1.6.0/getting-started/topics/cem-component-support.html

CEM MiNiFi C++ Agent - 1.23.06 https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-cpp/topics/cem-minifi-cpp-agent-updates.html

CEM MiNiFi Java Agent - 1.23.04 https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-java/topics/cem-minifi-java-agent-updates.html

Docs

https://docs.cloudera.com/cem/1.6.0/rest-api-reference/index.html

https://leanpub.com/streamprocessingwithapacheflink/c/ucQ5dLcZYAo2?utm_source=substack&utm_medium=email

https://docs.cloudera.com/cem/1.6.0/using-cem/topics/cem-agent-deployer-securing-agents.html

https://docs.cloudera.com/cem/latest/installation/topics/cem-set-encryption-password.html

Videos

https://www.youtube.com/watch?v=zEGffUz1jKo

https://www.youtube.com/watch?v=rQo3Pk5smz8

https://www.youtube.com/watch?v=0G98z_fs_SQ&t=605s&ab_channel=DataScienceFestival

https://www.youtube.com/watch?v=JdsY5p1GZ38&t=29s&ab_channel=DatainMotion

https://www.youtube.com/watch?v=nuS3X5DxFWM&ab_channel=DatainMotion

Articles

https://medium.com/@tspann/using-apache-nifi-to-backup-and-restore-minifi-flows-from-cloudera-efm-87f303b56ebd

https://medium.com/@tspann/no-code-sentiment-analysis-with-hugging-face-and-apache-nifi-for-article-summaries-cf06d1df1283

https://medium.com/@tspann/hbase-to-hbase-via-apache-nifi-d3d1d674eab2

https://www.playtika-blog.com/playtika-ai/how-playtika-achieved-ai-automation-customer-service-with-apache-nifi-part-2/

https://docs.cloudera.com/cem/1.6.0/using-minifi-as-log-collector-pod-in-kubernetes/topics/cem-using-minifi-as-log-collector-pod-in-kubernetes.html

https://docs.cloudera.com/cem/1.6.0/using-scripting/topics/cem-script-initial-setup.html#cem-using-script-to-integrate-custom-code

https://medium.com/geekculture/decision-making-with-linked-data-event-streams-and-powerbi-5cd8379d32

https://medium.com/@samuel.vanackere/linked-data-event-streams-explained-in-8-minutes-e1c76d077bb9

https://hilla.dev/blog/ai-chatbot-in-java/

https://www.linkedin.com/posts/nicholasrenotte_watsonx-llms-mlops-activity-7093359957890240512-f8RZ/

https://cloudinfrastructure.substack.com/p/introducing-the-redpoint-open-source

https://www.loicmathieu.fr/wordpress/en/informatique/java-21-quoi-de-neuf/

https://litellm.ai/

https://semiconductor.samsung.com/news-events/tech-blog/samsung-announces-innovations-to-enhance-memory-customer-experience-in-data-centric-era-at-fms-2023/

https://kevinbtalbert.github.io/iceberg/nifi/nifi-iceberg/

Free Stuff

For anyone who needs to upgrade Java or escape potential liabilities, this is the guide. It also provides helpful insights for any Java developer or anyone developing on top of current or future JVMs. https://www.azul.com/openjdk-migration-for-dummies/

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

Throw Back Articles

https://github.com/apache/kudu/blob/master/examples/quickstart/impala/README.adoc

https://medium.com/@nifi.notes/building-an-effective-nifi-flow-replacetext-60a6016d378c

https://community.cloudera.com/t5/Community-Articles/Running-DNS-and-Domain-Scanning-Tools-From-Apache-NiFi/ta-p/248484

https://community.cloudera.com/t5/Community-Articles/Using-Cloudera-Data-Science-Workbench-with-Apache-NiFi-and/ta-p/249469

https://community.cloudera.com/t5/Community-Articles/Scanning-Documents-into-Data-Lakes-via-Tesseract-MQTT-Python/ta-p/248492

https://community.cloudera.com/t5/Community-Articles/Adding-Stanford-CoreNLP-To-Big-Data-Pipelines-Apache-NiFi-1/ta-p/249378

https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-for-Speech-Processing-Speech-to-Text-with/ta-p/249242

https://community.cloudera.com/t5/Community-Articles/Ingesting-Flight-Data-ADS-B-USB-Receiver-with-Apache-NiFi-1/ta-p/247940

https://community.cloudera.com/t5/Community-Articles/Integrating-lucene-geo-gazetteer-For-Geo-Parsing-with-Apache/ta-p/247993

https://community.cloudera.com/t5/Community-Articles/Creating-WordClouds-From-DataFlows-with-Apache-NiFi-and/ta-p/246605

https://community.cloudera.com/t5/Community-Articles/NIFI-1-x-For-Automatic-Music-Playing-Pipelines/ta-p/247994

https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-with-Apache-MXNet-GluonCV-for-YOLO-3-Deep/ta-p/248979

https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-1-Apache-NiFi/ta-p/248265

https://community.cloudera.com/t5/Community-Articles/Monitoring-Energy-Usage-Utilizing-Apache-NiFi-Python-Apache/ta-p/247525

https://community.cloudera.com/t5/Community-Articles/Using-Command-Line-Security-Tools-from-Apache-NiFi/ta-p/248158

https://community.cloudera.com/t5/Community-Articles/Apache-NiFi-Processor-for-Apache-MXNet-SSD-Single-Shot/ta-p/249240

https://community.cloudera.com/t5/Community-Articles/Ingesting-Apache-MXNet-Gluon-Deep-Learning-Results-Via-MQTT/ta-p/248544

https://community.cloudera.com/t5/Community-Articles/Updating-The-Apache-OpenNLP-Community-Apache-NiFi-Processor/ta-p/248398

https://community.cloudera.com/t5/Community-Articles/Integration-Apache-OpenNLP-1-8-4-into-Apache-NiFi-1-5-For/ta-p/248010

https://community.cloudera.com/t5/Community-Articles/Tracking-Phone-Location-for-Android-and-IoT-with-OwnTracks/ta-p/244875

https://community.cloudera.com/t5/Community-Articles/Ingesting-Drone-Data-From-Ryze-Tello-Part-1-Setup-and/ta-p/249422

https://community.cloudera.com/t5/Community-Articles/Ingesting-RDBMS-Data-As-New-Tables-Arrive-Automagically-into/ta-p/246214

https://community.cloudera.com/t5/Community-Articles/Incrementally-Streaming-RDBMS-Data-to-Your-Hadoop-DataLake/ta-p/247927

https://community.cloudera.com/t5/Community-Articles/Ingesting-and-Analyzing-Street-Camera-Data-from-Major-US/ta-p/249194

https://community.cloudera.com/t5/Community-Articles/Basic-Image-Processing-and-Linux-Utilities-As-Part-of-a-Big/ta-p/249121

https://community.cloudera.com/t5/Community-Articles/Hosting-and-Ingesting-Data-From-Web-Pages-Desktop-and-Mobile/ta-p/244575

https://community.cloudera.com/t5/Community-Articles/QADCDC-Our-how-to-ingest-some-database-tables-to-Hadoop-Very/ta-p/245229

https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-2-Indoor-Air/ta-p/249471

https://community.cloudera.com/t5/Community-Articles/Streaming-Ingest-of-Google-Sheets-with-HDF-2-0/ta-p/247764

https://community.cloudera.com/t5/Community-Articles/Ingesting-Golden-Gate-Records-From-Apache-Kafka-and/ta-p/247557

https://community.cloudera.com/t5/Community-Articles/Data-Processing-Pipeline-Parsing-PDFs-and-Identifying-Names/ta-p/249105

https://community.cloudera.com/t5/Community-Articles/Using-A-TensorFlow-quot-Person-Blocker-quot-With-Apache-NiFi/ta-p/248141

https://community.cloudera.com/t5/Community-Articles/Su-Su-Sussudio-Sudoers-Log-Parsing-with-Apache-NiFi/ta-p/249461

https://community.cloudera.com/t5/Community-Articles/Integrating-IBM-Watson-Machine-Learning-APIs-with-Apache/ta-p/247545

https://community.cloudera.com/t5/Community-Articles/Simple-Change-Data-Capture-CDC-with-SQL-Selects-via-Apache/ta-p/308376

https://community.cloudera.com/t5/Community-Articles/Deep-Learning-IoT-Workflows-with-Raspberry-Pi-MQTT-MXNet/ta-p/249456

https://community.cloudera.com/t5/Community-Articles/Parsing-Web-Pages-for-Images-with-Apache-NiFi/ta-p/248415

https://community.cloudera.com/t5/Community-Articles/Trigger-SonicPi-Music-Via-Apache-NiFi/ta-p/248587

https://community.cloudera.com/t5/Community-Articles/Using-Parsey-McParseFace-Google-TensorFlow-Syntaxnet-From/ta-p/246337

https://community.cloudera.com/t5/Community-Articles/Ingesting-osquery-Into-Apache-Phoenix-using-Apache-NiFi/ta-p/249308

https://community.cloudera.com/t5/Community-Articles/Converting-PowerPoint-Presentations-into-French-from-English/ta-p/248974

https://community.cloudera.com/t5/Community-Articles/Posting-Images-with-Apache-NiFi-1-7-and-a-Custom-Processor/ta-p/249017

https://community.cloudera.com/t5/Community-Articles/Parsing-Any-Document-with-Apache-NiFi-1-5-with-Apache-Tika/ta-p/247672

Events

https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7

August 23, 2023: NYC. AI. https://www.aicamp.ai/event/eventdetails/W2023082314

September 26-27, 2023: Current Event. San Jose, California. https://www.confluent.io/events/current/

October 7-10, 2023: Halifax, Nova Scotia, Canada. Community over Code. https://communityovercode.org/

October 8, 2023: Streaming Track, Room 102 https://communityovercode.org/schedule/#Oct8 https://communityovercode.org/schedule-list/#SG007 https://communityovercode.org/schedule-list/#SG011

October 10, 2023: Internet of Things Track, Room 109 https://communityovercode.org/schedule/#Oct10 https://communityovercode.org/schedule-list/#IOT001

October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

November 2, 2023: Evolve. NYC https://www.cloudera.com/about/events/evolve/new-york.html#register

November 8, 2023: Flink Forward, Seattle. https://www.flink-forward.org/seattle-2023

November 22, 2023: Big Data Conference. Hybrid
https://bigdataconference.eu/

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

  • https://github.com/tspannhw/FLaNK-Edge/tree/main
  • https://github.com/tspannhw/FLaNK-Edge-Models
  • https://github.com/tspannhw/FLaNK-HTAP

Tools

  • https://github.com/fluent/fluent-bit
  • https://github.com/jdb78/pytorch-forecasting
  • https://www.symmetricds.org/
  • https://github.com/cloudera/CML_AMP_MLflow_Tracking
  • https://github.com/ogakulov/CML_AMP_Churn_Prediction_mlflow
  • https://github.com/Wisser/Jailer
  • https://flatdraw.com/
  • https://google.github.io/typograms/#installation
  • https://github.com/mukel/llama2.java
  • https://meoweler.com/
  • https://github.com/karpathy/llama2.c
  • https://github.com/simonmesmith/agentflow
  • https://stability.ai/blog/stablecode-llm-generative-ai-coding
  • https://dukope.itch.io/lcd-please
  • https://github.com/feldera/dbsp
  • https://github.com/libsql/libsql
  • https://github.com/xusenlinzy/api-for-open-llm
  • https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
  • https://github.com/openai/chatgpt-retrieval-plugin
  • https://www.pythongasm.com/build-gpt-powered-chatbots-around-enterprise-data-with-python
  • https://github.com/langgenius/dify
  • https://github.com/languagetool-org/languagetool
  • https://github.com/alipay/fury
  • https://github.com/morph-labs/rift
  • https://github.com/teaxyz/cli
  • https://github.com/Aiven-Open/sql-cli-for-apache-flink-docker/releases/tag/1.17.1
  • https://github.com/Aiven-Open/jdbc-connector-for-apache-kafka
  • https://towhee.io/tasks/detail/pipeline/retrieval-augmented-generation
  • https://github.com/towhee-io/towhee
  • https://github.com/jason-jz-zhu/databathing
  • https://github.com/microsoft/Llama-2-Onnx
  • https://www.mlexpert.io/prompt-engineering/langchain-quickstart
  • https://foojay.io/today/pi4j-operating-system-for-java-development-on-raspberry-pi/
  • https://www.cambioml.com/pykoi/
  • https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-server/readme.md
  • https://litellm.ai/
  • https://redpoint.metabaseapp.com/public/dashboard/5e802588-cc2c-489c-a2f3-283d6c3cd298
  • https://github.com/HumanSignal/label-studio
  • https://github.com/daefresh/awesome-data-temporality

© 2020-2023 Tim Spann
