FLaNK Stack Weekly for 14 August 2023
14-August-2023
FLiPN-FLaNK Stack Weekly
Tim Spann @PaaSDev
https://www.threads.net/@tspannhw
https://medium.com/@tspann/subscribe
A lot is going on as we head into the fall rush, when Flink, Kafka, Apache, and other conferences take place throughout North America.
Get your new Apache NiFi for Dummies!
https://www.cloudera.com/campaign/apache-nifi-for-dummies.html
https://ossinsight.io/analyze/tspannhw
CODE + COMMUNITY
Please join my meetup groups: NJ/NYC/Philly/Virtual.
http://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-newyork/
https://www.meetup.com/futureofdata-philadelphia/
This is Issue #98
https://github.com/tspannhw/FLiPStackWeekly
https://www.linkedin.com/pulse/schedule-2023-tim-spann-/
https://www.cloudera.com/solutions/dim-developer.html
Releases
EFM 1.6.0 https://docs.cloudera.com/cem/1.6.0/getting-started/topics/cem-component-support.html
CEM MiNiFi C++ Agent - 1.23.06 https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-cpp/topics/cem-minifi-cpp-agent-updates.html
CEM MiNiFi Java Agent - 1.23.04 https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-java/topics/cem-minifi-java-agent-updates.html
Docs
https://docs.cloudera.com/cem/1.6.0/rest-api-reference/index.html
https://leanpub.com/streamprocessingwithapacheflink/c/ucQ5dLcZYAo2?utm_source=substack&utm_medium=email
https://docs.cloudera.com/cem/1.6.0/using-cem/topics/cem-agent-deployer-securing-agents.html
https://docs.cloudera.com/cem/latest/installation/topics/cem-set-encryption-password.html
Videos
https://www.youtube.com/watch?v=zEGffUz1jKo
https://www.youtube.com/watch?v=rQo3Pk5smz8
https://www.youtube.com/watch?v=0G98z_fs_SQ&t=605s&ab_channel=DataScienceFestival
https://www.youtube.com/watch?v=JdsY5p1GZ38&t=29s&ab_channel=DatainMotion
https://www.youtube.com/watch?v=nuS3X5DxFWM&ab_channel=DatainMotion
Articles
https://medium.com/@tspann/using-apache-nifi-to-backup-and-restore-minifi-flows-from-cloudera-efm-87f303b56ebd
https://medium.com/@tspann/no-code-sentiment-analysis-with-hugging-face-and-apache-nifi-for-article-summaries-cf06d1df1283
https://medium.com/@tspann/hbase-to-hbase-via-apache-nifi-d3d1d674eab2
https://www.playtika-blog.com/playtika-ai/how-playtika-achieved-ai-automation-customer-service-with-apache-nifi-part-2/
https://docs.cloudera.com/cem/1.6.0/using-minifi-as-log-collector-pod-in-kubernetes/topics/cem-using-minifi-as-log-collector-pod-in-kubernetes.html
https://docs.cloudera.com/cem/1.6.0/using-scripting/topics/cem-script-initial-setup.html#cem-using-script-to-integrate-custom-code
https://medium.com/geekculture/decision-making-with-linked-data-event-streams-and-powerbi-5cd8379d32
https://medium.com/@samuel.vanackere/linked-data-event-streams-explained-in-8-minutes-e1c76d077bb9
https://hilla.dev/blog/ai-chatbot-in-java/
https://www.linkedin.com/posts/nicholasrenotte_watsonx-llms-mlops-activity-7093359957890240512-f8RZ/
https://cloudinfrastructure.substack.com/p/introducing-the-redpoint-open-source
https://www.loicmathieu.fr/wordpress/en/informatique/java-21-quoi-de-neuf/
https://litellm.ai/
https://semiconductor.samsung.com/news-events/tech-blog/samsung-announces-innovations-to-enhance-memory-customer-experience-in-data-centric-era-at-fms-2023/
https://kevinbtalbert.github.io/iceberg/nifi/nifi-iceberg/
Free Stuff
For anyone who needs to upgrade Java or escape from potential liabilities, this is the guide. It also provides helpful insights for any Java developer or anyone developing on top of current or future JVMs. https://www.azul.com/openjdk-migration-for-dummies/
https://www.cloudera.com/campaign/apache-nifi-for-dummies.html
Throw Back Articles
https://github.com/apache/kudu/blob/master/examples/quickstart/impala/README.adoc
https://medium.com/@nifi.notes/building-an-effective-nifi-flow-replacetext-60a6016d378c
https://community.cloudera.com/t5/Community-Articles/Running-DNS-and-Domain-Scanning-Tools-From-Apache-NiFi/ta-p/248484
https://community.cloudera.com/t5/Community-Articles/Using-Cloudera-Data-Science-Workbench-with-Apache-NiFi-and/ta-p/249469
https://community.cloudera.com/t5/Community-Articles/Scanning-Documents-into-Data-Lakes-via-Tesseract-MQTT-Python/ta-p/248492
https://community.cloudera.com/t5/Community-Articles/Adding-Stanford-CoreNLP-To-Big-Data-Pipelines-Apache-NiFi-1/ta-p/249378
https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-for-Speech-Processing-Speech-to-Text-with/ta-p/249242
https://community.cloudera.com/t5/Community-Articles/Ingesting-Flight-Data-ADS-B-USB-Receiver-with-Apache-NiFi-1/ta-p/247940
https://community.cloudera.com/t5/Community-Articles/Integrating-lucene-geo-gazetteer-For-Geo-Parsing-with-Apache/ta-p/247993
https://community.cloudera.com/t5/Community-Articles/Creating-WordClouds-From-DataFlows-with-Apache-NiFi-and/ta-p/246605
https://community.cloudera.com/t5/Community-Articles/NIFI-1-x-For-Automatic-Music-Playing-Pipelines/ta-p/247994
https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-with-Apache-MXNet-GluonCV-for-YOLO-3-Deep/ta-p/248979
https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-1-Apache-NiFi/ta-p/248265
https://community.cloudera.com/t5/Community-Articles/Monitoring-Energy-Usage-Utilizing-Apache-NiFi-Python-Apache/ta-p/247525
https://community.cloudera.com/t5/Community-Articles/Using-Command-Line-Security-Tools-from-Apache-NiFi/ta-p/248158
https://community.cloudera.com/t5/Community-Articles/Apache-NiFi-Processor-for-Apache-MXNet-SSD-Single-Shot/ta-p/249240
https://community.cloudera.com/t5/Community-Articles/Ingesting-Apache-MXNet-Gluon-Deep-Learning-Results-Via-MQTT/ta-p/248544
https://community.cloudera.com/t5/Community-Articles/Updating-The-Apache-OpenNLP-Community-Apache-NiFi-Processor/ta-p/248398
https://community.cloudera.com/t5/Community-Articles/Integration-Apache-OpenNLP-1-8-4-into-Apache-NiFi-1-5-For/ta-p/248010
https://community.cloudera.com/t5/Community-Articles/Tracking-Phone-Location-for-Android-and-IoT-with-OwnTracks/ta-p/244875
https://community.cloudera.com/t5/Community-Articles/Ingesting-Drone-Data-From-Ryze-Tello-Part-1-Setup-and/ta-p/249422
https://community.cloudera.com/t5/Community-Articles/Ingesting-RDBMS-Data-As-New-Tables-Arrive-Automagically-into/ta-p/246214
https://community.cloudera.com/t5/Community-Articles/Incrementally-Streaming-RDBMS-Data-to-Your-Hadoop-DataLake/ta-p/247927
https://community.cloudera.com/t5/Community-Articles/Ingesting-and-Analyzing-Street-Camera-Data-from-Major-US/ta-p/249194
https://community.cloudera.com/t5/Community-Articles/Basic-Image-Processing-and-Linux-Utilities-As-Part-of-a-Big/ta-p/249121
https://community.cloudera.com/t5/Community-Articles/Hosting-and-Ingesting-Data-From-Web-Pages-Desktop-and-Mobile/ta-p/244575
https://community.cloudera.com/t5/Community-Articles/QADCDC-Our-how-to-ingest-some-database-tables-to-Hadoop-Very/ta-p/245229
https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-2-Indoor-Air/ta-p/249471
https://community.cloudera.com/t5/Community-Articles/Streaming-Ingest-of-Google-Sheets-with-HDF-2-0/ta-p/247764
https://community.cloudera.com/t5/Community-Articles/Ingesting-Golden-Gate-Records-From-Apache-Kafka-and/ta-p/247557
https://community.cloudera.com/t5/Community-Articles/Data-Processing-Pipeline-Parsing-PDFs-and-Identifying-Names/ta-p/249105
https://community.cloudera.com/t5/Community-Articles/Using-A-TensorFlow-quot-Person-Blocker-quot-With-Apache-NiFi/ta-p/248141
https://community.cloudera.com/t5/Community-Articles/Su-Su-Sussudio-Sudoers-Log-Parsing-with-Apache-NiFi/ta-p/249461
https://community.cloudera.com/t5/Community-Articles/Integrating-IBM-Watson-Machine-Learning-APIs-with-Apache/ta-p/247545
https://community.cloudera.com/t5/Community-Articles/Simple-Change-Data-Capture-CDC-with-SQL-Selects-via-Apache/ta-p/308376
https://community.cloudera.com/t5/Community-Articles/Deep-Learning-IoT-Workflows-with-Raspberry-Pi-MQTT-MXNet/ta-p/249456
https://community.cloudera.com/t5/Community-Articles/Parsing-Web-Pages-for-Images-with-Apache-NiFi/ta-p/248415
https://community.cloudera.com/t5/Community-Articles/Trigger-SonicPi-Music-Via-Apache-NiFi/ta-p/248587
https://community.cloudera.com/t5/Community-Articles/Using-Parsey-McParseFace-Google-TensorFlow-Syntaxnet-From/ta-p/246337
https://community.cloudera.com/t5/Community-Articles/Ingesting-osquery-Into-Apache-Phoenix-using-Apache-NiFi/ta-p/249308
https://community.cloudera.com/t5/Community-Articles/Converting-PowerPoint-Presentations-into-French-from-English/ta-p/248974
https://community.cloudera.com/t5/Community-Articles/Posting-Images-with-Apache-NiFi-1-7-and-a-Custom-Processor/ta-p/249017
https://community.cloudera.com/t5/Community-Articles/Parsing-Any-Document-with-Apache-NiFi-1-5-with-Apache-Tika/ta-p/247672
Events
https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7
August 23, 2023: NYC. AI. https://www.aicamp.ai/event/eventdetails/W2023082314
September 26-27, 2023: Current Event. San Jose, California. https://www.confluent.io/events/current/
October 7-10, 2023: Halifax, Canada. Community over Code. https://communityovercode.org/
October 8, 2023: Streaming Track, Room 102 https://communityovercode.org/schedule/#Oct8 https://communityovercode.org/schedule-list/#SG007 https://communityovercode.org/schedule-list/#SG011
October 10, 2023: Internet of Things Track, Room 109 https://communityovercode.org/schedule/#Oct10 https://communityovercode.org/schedule-list/#IOT001
October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html
November 2, 2023: Evolve. NYC https://www.cloudera.com/about/events/evolve/new-york.html#register
November 8, 2023: Flink Forward, Seattle. https://www.flink-forward.org/seattle-2023
November 22, 2023: Big Data Conference. Hybrid. https://bigdataconference.eu/
Cloudera Events https://www.cloudera.com/about/events.html
More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/
Code
- https://github.com/tspannhw/FLaNK-Edge/tree/main
- https://github.com/tspannhw/FLaNK-Edge-Models
- https://github.com/tspannhw/FLaNK-HTAP
Tools
- https://github.com/fluent/fluent-bit
- https://github.com/jdb78/pytorch-forecasting
- https://www.symmetricds.org/
- https://github.com/cloudera/CML_AMP_MLflow_Tracking
- https://github.com/ogakulov/CML_AMP_Churn_Prediction_mlflow
- https://github.com/Wisser/Jailer
- https://flatdraw.com/
- https://google.github.io/typograms/#installation
- https://github.com/mukel/llama2.java
- https://meoweler.com/
- https://github.com/karpathy/llama2.c
- https://github.com/simonmesmith/agentflow
- https://stability.ai/blog/stablecode-llm-generative-ai-coding
- https://dukope.itch.io/lcd-please
- https://github.com/feldera/dbsp
- https://github.com/libsql/libsql
- https://github.com/xusenlinzy/api-for-open-llm
- https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
- https://github.com/openai/chatgpt-retrieval-plugin
- https://www.pythongasm.com/build-gpt-powered-chatbots-around-enterprise-data-with-python
- https://github.com/langgenius/dify
- https://github.com/languagetool-org/languagetool
- https://github.com/alipay/fury
- https://github.com/morph-labs/rift
- https://github.com/teaxyz/cli
- https://github.com/Aiven-Open/sql-cli-for-apache-flink-docker/releases/tag/1.17.1
- https://github.com/Aiven-Open/jdbc-connector-for-apache-kafka
- https://towhee.io/tasks/detail/pipeline/retrieval-augmented-generation
- https://github.com/towhee-io/towhee
- https://github.com/jason-jz-zhu/databathing
- https://github.com/microsoft/Llama-2-Onnx
- https://www.mlexpert.io/prompt-engineering/langchain-quickstart
- https://foojay.io/today/pi4j-operating-system-for-java-development-on-raspberry-pi/
- https://www.cambioml.com/pykoi/
- https://github.com/BerriAI/litellm/blob/main/cookbook/proxy-server/readme.md
- https://litellm.ai/
- https://redpoint.metabaseapp.com/public/dashboard/5e802588-cc2c-489c-a2f3-283d6c3cd298
- https://github.com/HumanSignal/label-studio
- https://github.com/daefresh/awesome-data-temporality
© 2020-2023 Tim Spann