FLaNK Stack Weekly for 19 February 2024

Timothy Spann
19-February-2024

Monday Feb 19, 2024 is Presidents Day

FLaNK Stack Weekly

Tim Spann @PaaSDev

https://pebble.is/PaaSDev

https://vimeo.com/flankstack

https://www.youtube.com/@FLaNK-Stack

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

Get your new Apache NiFi for Dummies!

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

Trial: https://console.us-west-1.cdp.cloudera.com/trial/register.html#/

Building Realtime AI Applications with Apache Flink

CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

This is Issue #125

https://github.com/tspannhw/FLiPStackWeekly

https://www.cloudera.com/solutions/dim-developer.html

Articles

NYC Traffic?? (NiFi, Kafka, Flink) https://medium.com/@tspann/nyc-traffic-are-you-kidding-me-6d3fa853903b

Subways and Transit Updates in Real-Time https://medium.com/@tspann/subways-and-transit-updates-in-real-time-30c104c359ef

Open Source Data Infrastructure Meetup - Feb 2024 https://medium.com/@tspann/open-source-data-infrastructure-meetup-feb-2024-9e8048666828

Catalogs in Flink SQL: A Primer https://www.decodable.co/blog/catalogs-in-flink-sql-a-primer

https://www.wired.com/story/goody-2-worlds-most-responsible-ai-chatbot/

https://lilianweng.github.io/posts/2024-02-05-human-data-quality

https://eugeneyan.com/writing/synthetic/

https://www.alexmolas.com/2024/02/05/a-search-engine-in-80-lines.html?

https://vectorize.io/2024/01/25/openai-text-embedding-3-embedding-models-first-look/

https://txt.cohere.com/aya/

https://www.infoq.com/presentations/virtual-threads-lightweight-concurrency/

https://medium.com/@james.li/how-to-visualise-real-time-order-book-data-and-host-your-own-dashboard-part-1-2-c77aa0fc5f59

https://docs.coinapi.io/how-to-guides/real-time-data-visualization-with-javascript

https://evidentinsights.com/ai-index/

https://fmirkes.github.io/articles/20190827.html

https://blog.dagworks.io/p/using-ipython-jupyter-magic-commands

https://medium.com/practice-in-public/these-words-make-it-obvious-that-your-text-is-written-by-ai-9b04f399d88c

https://technology.amis.nl/big-data-database/apache-nifi-forwarding-http-headers/

https://medium.com/@masreis/text-extraction-and-ocr-with-apache-tika-302464895e5f

Videos

Unlocking Financial Data with Real-Time Pipelines (OSACon 2023) https://www.youtube.com/watch?v=Q7gF7m4yFi4&ab_channel=OSACon

The Never Landing Stream https://www.youtube.com/watch?v=M8Bp0tRGvV0

Tips

https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-to-split-incoming-data-from-a-file-based-on/m-p/220283

February 8, 2024 Meetup

https://www.slideshare.net/slideshows/ny-open-source-data-meetup-feb-8-2024-building-realtime-pipelines-with-flank-a-case-study-with-transit-data/266227433

Events

Feb 2024: Webinar https://www.cloudera.com/about/events/webinars/stay-ahead-of-cyber-threats-by-utilizing-data-in-motion.html?utm_medium=virtual-event&utm_source=resources-module&keyplay=ALL&utm_campaign=FY25-Q1-CorporateWebinar-AMER-cyber-threats&cid=701Hr000001pXCQIA2

Feb 20, 2024: 12-1PM EST. Virtual. Azure Data Tech Groups: DBA Fundamentals Group https://www.meetup.com/dba-fundamentals-group/events/296855261/

Feb 22, 2024: NYC. AI Camp Meetup. https://www.aicamp.ai/event/eventdetails/W2024022214

Feb 28, 2024: NYC. Cloudera Meetup. Flink https://www.meetup.com/futureofdata-princeton/events/298661947/

Feb 29, 2024: Virtual. Conf42 Python. https://www.conf42.com/Python_2024_Tim_Spann_apache_nifi_2_processors

https://www.conf42.com/Python_2024_Karin_Wolok_nifi__kafka_risingwave_iceberg_llm

Soon, 2024: Princeton. TigerLabs New Location. Meetup. GenAI. https://www.meetup.com/applied-generative-artificial-intelligence-applications/

March 15, 2024: TCF Pro. Princeton, NJ. IEEE Information Technology Professional Conference at the Trenton Computer Festival. https://princetonacm.acm.org/tcfpro/

April 2024: XtremeJ 2024. Virtual. https://xtremej.dev/2023/schedule/

April 11, 2024: Conf42 LLM. Virtual. https://www.conf42.com/llms2024

May 8-9, 2024: Data Summit 2024. Boston, MA. https://www.dbta.com/DataSummit/2024/default.aspx

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2024-tim-spann--y4coe

Code

  • https://github.com/tspannhw/FLaNK-python-watsonx-processor
  • https://github.com/thammuio/doc-genius-ai
  • https://github.com/tspannhw/FLaNK-python-processors

Models

  • https://github.com/ncbi/GeneGPT
  • https://www.arxiv.org/abs/2402.03405
  • https://huggingface.co/foduucom/stockmarket-pattern-detection-yolov8

Tools

  • https://www.thestack.technology/apple-quietly-open-sources-key-ai-tools
  • https://arxiv.org/abs/2312.11805?utm_source=substack&utm_medium=email
  • https://bigscience.huggingface.co/blog/bloom?utm_source=substack&utm_medium=email
  • https://huggingface.co/blog/2023-in-llms?utm_source=substack&utm_medium=email
  • https://killedbyapixel.github.io/TinyCode/games/CrossMyHeart/
  • https://github.com/datadreamer-dev/DataDreamer
  • https://github.com/llmware-ai/llmware
  • https://github.com/uclaml/SPIN
  • https://medium.com/pinterest-engineering/running-unified-pubsub-client-in-production-at-pinterest-64ae2e721daa
  • https://github.com/pinterest/psc
  • https://arace.tech/products/radxa-rock-5a-pink-edition
  • https://docs.mistral.ai/
  • https://github.com/mistralai/
  • https://github.com/YakrooThai/Air-Quality-Monitor/blob/main/pm25_simUart.py
  • https://pkl-lang.org/main/current/introduction/use-cases.html
  • https://datavolo.io/2024/02/how-to-rapidly-build-genai-chat-bots-in-slack/
  • https://github.com/AI-Northstar-Tech/vector-io
  • https://github.com/ocrmypdf/OCRmyPDF
  • https://solr.apache.org/docs/9_5_0/changes/Changes.html#v9.5.0.new_features
  • https://www.infoq.com/presentations/PRQL/
  • https://prql-lang.org/
  • https://github.com/kubewharf/kubeadmiral
  • https://github.com/EladLeev/KeyToField-smt
  • https://qdrant.github.io/fastembed/
  • https://lolpop.readthedocs.io/en/latest/
  • https://github.com/intel/openvino-plugins-ai-audacity/releases
  • https://github.com/intel/cve-bin-tool
  • https://github.com/intel/openvino-ai-plugins-gimp
  • https://benyaakobi.medium.com/nifi-scheduling-a522a1c9e740
  • https://github.com/segmind/segmoe
  • https://github.com/tensorflow/gnn
  • https://github.com/apache/nifi/pull/8152
  • https://www.opencv.ai/blog/how-computer-vision-makes-people-look-more-attractive
  • https://github.com/intel-retail/automated-self-checkout/tree/main
  • https://github.com/openvinotoolkit/openvino_notebooks
  • https://github.com/reorproject/reor
  • https://github.com/traceloop/openllmetry-js
  • https://xon.sh/
  • https://cohere.com/research/aya
  • https://github.com/AugustDev/enchanted
  • https://github.com/NVIDIA/TensorRT-LLM
  • https://hamel.dev/blog/posts/prompt/
  • https://datatables.net/extensions/colreorder/
  • https://github.com/metavoiceio/metavoice-src
  • https://github.com/time-series-foundation-models/lag-llama
  • https://github.com/mwmbl/mwmbl
  • https://github.com/johannschopplich/unlazy
  • https://github.com/kubetail-org/kubetail
  • https://github.com/inoxlang/inox
  • https://github.com/lanmaster53/recon-ng
  • https://openai.com/sora
  • https://plotly.com/javascript/
  • https://dstack.ai/
  • https://github.com/valine/NeuralFlow
  • https://github.com/timeplus-io/proton
  • https://beta.tauri.app/guides/create/
  • https://github.com/5starkarma/face-smoothing
  • https://github.com/aws/event-ruler
  • https://notebooklm.google/
  • https://github.com/mrn-aglic/apache-iceberg-data-exploration
  • https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/tutorials/Certification_Trainings
  • https://blog.stackademic.com/building-a-video-games-recommender-system-with-milvus-fastapi-and-docker-5c0475fa2b06
  • https://docs.ultralytics.com/models/yolo-world/#predict-usage
  • https://github.com/AILab-CVC/YOLO-World
  • https://github.com/TabbyML/tabby
  • https://github.com/facebookresearch/jepa
  • https://docs.spring.io/spring-ai/reference/api/clients/huggingface.html
  • https://chat.lmsys.org/
  • https://blog.min.io/minio-events-with-apache-nifi/
  • https://github.com/LargeWorldModel/LWM
  • https://nvidia.github.io/NeMo/blogs/2024/2024-02-canary/#additional-resources
  • https://funnifi.blogspot.com/2023/04/transform-json-string-field-into-record.html
  • https://github.com/google/magika
  • https://tools.suckless.org/sent/
  • https://ines.io/blog/beginners-guide-beautiful-slides-talks/

© 2020-2024 Tim Spann
