FLaNK AI for 11 March 2024

Timothy Spann

11-March-2024

This week I am doing an AI meetup on Monday and a conference talk on Friday.

FLaNK Stack Weekly

Tim Spann @PaaSDev

https://pebble.is/PaaSDev

https://vimeo.com/flankstack

https://www.youtube.com/@FLaNK-Stack

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

CODE + COMMUNITY

Please join my meetup groups: NJ, NYC, Philly, and virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

This is Issue #128

https://github.com/tspannhw/FLiPStackWeekly

https://www.cloudera.com/solutions/dim-developer.html

New Releases

https://docs.cloudera.com/cem/2.1.2/installation/topics/cem-install-cem-cm.html

https://docs.cloudera.com/cem/2.1.2/release-notes/topics/cem-whats-new.html

Articles

NiFi Parameter Providers https://medium.com/@tspann/utilizing-apache-nifi-parameter-providers-36cf60313d5e

Mixtral Generative Sparse Mixture of Experts in DataFlows https://medium.com/@tspann/mixtral-generative-sparse-mixture-of-experts-in-dataflows-59744f7d28a9

Building an LLM Bot for Meetups and Conference Interactivity https://medium.com/@tspann/building-an-llm-bot-for-meetups-and-conference-interactivity-c211ea6e3b61

Kafka for Edge AI: Jetson Nano https://medium.com/@tspann/kafka-for-edge-ai-on-jetson-nano-enabling-efficient-data-streaming-c5bb01ca0705

Streaming Street Cams to YoLo v8 with Python and NiFi to MinIO (S3) https://medium.com/@tspann/streaming-street-cams-to-yolo-v8-with-python-and-nifi-to-minio-s3-3277e73723ce

Using OLLAMA with Mistral and Apache NiFi https://medium.com/@tspann/using-ollama-with-mistral-and-apache-nifi-720c17f5ff12

Using Google Gemma https://medium.com/@tspann/google-gemma-for-real-time-lightweight-open-llm-inference-88efe98e580f

https://medium.com/@tspann/open-source-vision-servers-pre-reqs-be2559e3ef52

https://medium.com/@ageospatial/geoforge-geospatial-analysis-with-large-language-models-geollms-2d3a0eaff8aa

https://readwrite.com/the-nsa-list-of-memory-safe-programming-languages-has-been-updated/

https://www.anthropic.com/news/claude-3-family

https://medium.com/@tspann/septa-transit-real-time-81082878b485

https://www.infosecurity-magazine.com/news/worm-created-generative-ai-systems/

https://cldr-steven-matison.github.io/blog/SSB-Iceberg-Time-Travel/

https://blog.devgenius.io/langchain-vs-llamaindex-vs-haystack-0d12d25b189e

https://github.com/milvus-io/milvus-haystack

https://dev.to/sebastienblanc/java-genai-the-ultimate-developers-joy-with-quarkus-langchain4j-and-ollama-179e

https://towardsdatascience.com/deploying-llms-into-production-using-tensorrt-llm-ed36e620dac4

https://community.cloudera.com/t5/Community-Articles/Hugging-Face-Spaces-AMPs-Accelerate-ML-Projects/ta-p/384685

https://spectrum.ieee.org/prompt-engineering-is-dead

https://developer.nvidia.com/blog/detecting-real-time-waste-contamination-using-edge-computing-and-video-analytics/

https://developer.nvidia.com/blog/solve-complex-ai-tasks-with-leaderboard-topping-smaug-72b-from-nvidia-ai-foundation-models/

https://github.com/cldr-steven-matison/SSB-CDC-Demo

https://engineering.princeton.edu/news/2024/03/06/built-ai-chip-moves-beyond-transistors-huge-computational-gains

https://docs.vllm.ai/en/latest/getting_started/quickstart.html

https://thenewstack.io/why-large-language-models-wont-replace-human-coders/

https://thenewstack.io/the-rise-of-small-language-models/

https://towardsdatascience.com/visualize-your-rag-data-evaluate-your-retrieval-augmented-generation-system-with-ragas-fc2486308557

https://verse.systems/blog/post/2024-03-09-using-llms-to-generate-fuzz-generators/

https://engineeringblog.yelp.com/2024/03/building-data-abstractions-with-streaming-at-yelp.html?u

https://medium.com/intuit-engineering/building-a-flexible-platform-for-optimal-use-of-llms-33a389cedf49

https://blog.allegro.tech/2024/03/kafka-performance-analysis.html

https://medium.com/analytics-vidhya/postgresql-integration-with-jupyter-notebook-deb97579a38d

Videos

Streaming Traffic Cameras https://www.youtube.com/watch?v=85ECRGJBEQU&ab_channel=DatainMotion-HowToBeaStreamingEngineer

Joining Three Kafka Topics in Flink SQL https://youtu.be/NI2n7uQJiP0?si=0aAFrkhOdqzZKisw

Continuous SQL https://youtu.be/k1mANc88OJc?si=o--ysshxFPem4Cze

CDF https://youtu.be/Z1IZ7uK_76s?si=XjlmcTQhwQ8F8aD0

Feb 22, 2024 NYC Meetup

https://www.slideshare.net/slideshows/2024-feb-ai-meetup-nyc-genaillmsmldata-codeless-generative-ai-pipelines/266444687

https://www.slideshare.net/slideshows/2024-february-28-nyc-meetup-unlocking-financial-data-with-realtime-pipelines/266539528

Feb 29, 2024 Conf42 Python 2024

https://www.slideshare.net/slideshows/conf42python-using-apache-nifi-apache-kafka-risingwave-and-apache-iceberg-with-stock-data-and-llm/266521940

https://www.slideshare.net/slideshows/conf42pythonbuilding-apache-nifi-20-python-processors/266522007

https://www.youtube.com/watch?v=awxzG7laWx4&ab_channel=Conf42

https://www.youtube.com/watch?v=FD16_oZ65Ug&ab_channel=Conf42

Weird

Encrypt a message until some date in the future. https://timelock.dev/

Events

March 11, 2024: Princeton. Meetup. GenAI. https://www.meetup.com/applied-generative-artificial-intelligence-applications/ https://23orchard.com/ https://www.startupgrind.com/events/details/startup-grind-princeton-presents-ignite-change-build-generative-ai-for-non-profits/

March 15, 2024: TCF Pro. Princeton, NJ. IEEE Information Technology Professional Conference at the Trenton Computer Festival. https://princetonacm.acm.org/tcfpro/

March 27, 2024: Startup Grind. Jersey City https://www.startupgrind.com/events/details/startup-grind-princeton-presents-startup-grind-princeton-amp-nj-big-data-alliance-generative-ai-reverse-pitch/

March 28, 2024: Pinot + NiFi + Flink + Kafka Meetup NYC https://www.meetup.com/real-time-analytics-meetup-ny/events/299290822/

April 2, 2024: XtremeJ 2024. Virtual. https://xtremej.dev/2023/schedule/

April 8-11, 2024: NLIT Summit. Seattle. https://www.fbcinc.com/e/nlit/default.aspx

April 11, 2024: Conf42 LLM. Virtual. https://www.conf42.com/llms2024

April 2024: AI Meetup NJ https://www.meetup.com/nj-gai/

May 8-9, 2024: Data Summit 2024. Boston, MA. https://www.dbta.com/DataSummit/2024/default.aspx

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2024-tim-spann--y4coe

Code

  • https://github.com/tspannhw/FLaNK-python-processors
  • https://github.com/tspannhw/FLaNK-IceIceData
  • https://github.com/tspannhw/PaK-Stocks
  • https://github.com/tspannhw/meetups/tree/main/28feb2024
  • https://github.com/SuperEllipse/LLM-demo-on-CML

Models

  • https://github.com/salesforce/LAVIS
  • https://github.com/lm-sys/FastChat
  • https://huggingface.co/Salesforce/blip-image-captioning-large
  • https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

Tools

  • https://github.com/BatsResearch/bonito
  • https://github.com/mini-sora/minisora
  • https://github.com/voxel51/fiftyone
  • https://github.com/kadjoudi/Fraud-Prevention-With-Cloudera-SSB
  • https://cldr-steven-matison.github.io/blog/SSB-Dead-Letter-Queue/
  • https://sql-workbench.com/
  • https://github.com/anitagraser/movement-analysis-tools
  • https://github.com/ni1o1/transbigdata
  • https://github.com/xoolive/traffic
  • https://ipyleaflet.readthedocs.io/en/latest/
  • https://altair-viz.github.io/
  • https://github.com/InsightLab/PyMove
  • https://ollama.com/library/starcoder2
  • https://github.com/ubicloud/ubicloud
  • https://www.waitingforcode.com/apache-flink/apache-flink-input-data-reading/read
  • https://tunnelbroker.net/
  • https://github.com/basecamp/kamal
  • https://fmcheatsheet.org/
  • https://github.com/allenai/wimbd
  • https://github.com/linkedin/openhouse
  • https://github.com/Data-Provenance-Initiative/Data-Provenance-Collection
  • https://www.dataprovenance.org/
  • https://github.com/rom1504/clip-retrieval
  • https://github.com/lmmlzn/awesome-llms-datasets
  • https://github.com/NVIDIA/TensorRT-LLM
  • https://pigsty.io/
  • https://github.com/dalibo/pg_activity
  • https://github.com/dalibo/temboard
  • https://github.com/daytonaio/daytona
  • https://github.com/abetlen/llama-cpp-python
  • https://github.com/PKU-YuanGroup/Open-Sora-Plan
  • https://github.com/gptscript-ai/gptscript
  • https://huggingface.co/docs/transformers/en/model_doc/mixtral
  • https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
  • https://www.microsoft.com/en-us/research/blog/autogen-enabling-next-generation-large-language-model-applications/
  • https://github.com/cloudera/CML_AMP_Deploy-Mistral7B-CML-Native-Model
  • https://hertzbeat.com/
  • https://github.com/LibrePDF/OpenPDF
  • https://github.com/HeyPuter/puter
  • https://github.com/vllm-project/vllm
  • https://github.com/SillyTavern/SillyTavern
  • https://github.com/datastax/ragstack-ai
  • https://github.com/KhoomeiK/LlamaGym
  • https://github.com/gingerbeardman/mandala
  • https://www.usebruno.com/
  • https://openddl.org/
  • https://blog.research.google/2024/03/croissant-metadata-format-for-ml-ready.html
  • https://stackoverflow.blog/2024/02/07/best-practices-for-building-llms/
  • https://developers.google.com/edu/python
  • https://github.com/Chleba/netscanner
  • https://github.com/run-llama/llama_parse
  • https://www.llamaindex.ai/blog/introducing-llamacloud-and-llamaparse-af8cedf9006b
  • https://babelfishpg.org/getstarted/
  • https://github.com/jdubois/2024-LangChain4J-demo
  • https://github.com/tufin/oasdiff

© 2020-2024 Tim Spann
