FLaNK Stack Weekly for 19 February 2024
Monday Feb 19, 2024 is Presidents Day
Tim Spann @PaaSDev
https://pebble.is/PaaSDev
https://vimeo.com/flankstack
https://www.youtube.com/@FLaNK-Stack
https://www.threads.net/@tspannhw
https://medium.com/@tspann/subscribe
Get your new Apache NiFi for Dummies!
https://www.cloudera.com/campaign/apache-nifi-for-dummies.html
https://ossinsight.io/analyze/tspannhw
Trial: https://console.us-west-1.cdp.cloudera.com/trial/register.html#/
CODE + COMMUNITY
Please join my meetup groups: NJ, NYC, Philly, and Virtual.
http://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-newyork/
https://www.meetup.com/futureofdata-philadelphia/
This is Issue #125
https://github.com/tspannhw/FLiPStackWeekly
https://www.cloudera.com/solutions/dim-developer.html
Articles
NYC Traffic?? (NiFi, Kafka, Flink) https://medium.com/@tspann/nyc-traffic-are-you-kidding-me-6d3fa853903b
Subways and Transit Updates in Real-Time https://medium.com/@tspann/subways-and-transit-updates-in-real-time-30c104c359ef
Open Source Data Infrastructure Meetup - Feb 2024 https://medium.com/@tspann/open-source-data-infrastructure-meetup-feb-2024-9e8048666828
Catalogs in Flink SQL: A Primer https://www.decodable.co/blog/catalogs-in-flink-sql-a-primer
https://www.wired.com/story/goody-2-worlds-most-responsible-ai-chatbot/
https://lilianweng.github.io/posts/2024-02-05-human-data-quality
https://eugeneyan.com/writing/synthetic/
https://www.alexmolas.com/2024/02/05/a-search-engine-in-80-lines.html
https://vectorize.io/2024/01/25/openai-text-embedding-3-embedding-models-first-look/
https://txt.cohere.com/aya/
https://www.infoq.com/presentations/virtual-threads-lightweight-concurrency/
https://medium.com/@james.li/how-to-visualise-real-time-order-book-data-and-host-your-own-dashboard-part-1-2-c77aa0fc5f59
https://docs.coinapi.io/how-to-guides/real-time-data-visualization-with-javascript
https://evidentinsights.com/ai-index/
https://fmirkes.github.io/articles/20190827.html
https://blog.dagworks.io/p/using-ipython-jupyter-magic-commands
https://medium.com/practice-in-public/these-words-make-it-obvious-that-your-text-is-written-by-ai-9b04f399d88c
https://technology.amis.nl/big-data-database/apache-nifi-forwarding-http-headers/
https://medium.com/@masreis/text-extraction-and-ocr-with-apache-tika-302464895e5f
Videos
Unlocking Financial Data with Real-Time Pipelines (OSACon 2023) https://www.youtube.com/watch?v=Q7gF7m4yFi4&ab_channel=OSACon
The Never Landing Stream https://www.youtube.com/watch?v=M8Bp0tRGvV0
Tips
https://community.cloudera.com/t5/Support-Questions/Apache-NiFi-to-split-incoming-data-from-a-file-based-on/m-p/220283
February 8, 2024 Meetup
https://www.slideshare.net/slideshows/ny-open-source-data-meetup-feb-8-2024-building-realtime-pipelines-with-flank-a-case-study-with-transit-data/266227433
Events
Feb 2024: Webinar https://www.cloudera.com/about/events/webinars/stay-ahead-of-cyber-threats-by-utilizing-data-in-motion.html?utm_medium=virtual-event&utm_source=resources-module&keyplay=ALL&utm_campaign=FY25-Q1-CorporateWebinar-AMER-cyber-threats&cid=701Hr000001pXCQIA2
Feb 20, 2024: 12-1PM EST. Virtual. Azure Data Tech Groups: DBA Fundamentals Group https://www.meetup.com/dba-fundamentals-group/events/296855261/
Feb 22, 2024: NYC. AI Camp Meetup. https://www.aicamp.ai/event/eventdetails/W2024022214
Feb 28, 2024: NYC. Cloudera Meetup. Flink https://www.meetup.com/futureofdata-princeton/events/298661947/
Feb 29, 2024: Virtual. Conf42 Python. https://www.conf42.com/Python_2024_Tim_Spann_apache_nifi_2_processors
https://www.conf42.com/Python_2024_Karin_Wolok_nifi__kafka_risingwave_iceberg_llm
Soon, 2024: Princeton. TigerLabs New Location. Meetup. GenAI. https://www.meetup.com/applied-generative-artificial-intelligence-applications/
March 15, 2024: TCF Pro. Princeton, NJ. IEEE Information Technology Professional Conference at the Trenton Computer Festival. https://princetonacm.acm.org/tcfpro/
April 2024: XtremeJ 2024. Virtual. https://xtremej.dev/2023/schedule/
April 11, 2024: Conf42 LLM. Virtual. https://www.conf42.com/llms2024
May 8-9, 2024: Data Summit 2024. Boston, MA. https://www.dbta.com/DataSummit/2024/default.aspx
Cloudera Events https://www.cloudera.com/about/events.html
More Events: https://www.linkedin.com/pulse/schedule-2024-tim-spann--y4coe
Code
- https://github.com/tspannhw/FLaNK-python-watsonx-processor
- https://github.com/thammuio/doc-genius-ai
- https://github.com/tspannhw/FLaNK-python-processors
Models
- https://github.com/ncbi/GeneGPT
- https://www.arxiv.org/abs/2402.03405
- https://huggingface.co/foduucom/stockmarket-pattern-detection-yolov8
Tools
- https://www.thestack.technology/apple-quietly-open-sources-key-ai-tools
- https://arxiv.org/abs/2312.11805?utm_source=substack&utm_medium=email
- https://bigscience.huggingface.co/blog/bloom?utm_source=substack&utm_medium=email
- https://huggingface.co/blog/2023-in-llms?utm_source=substack&utm_medium=email
- https://killedbyapixel.github.io/TinyCode/games/CrossMyHeart/
- https://github.com/datadreamer-dev/DataDreamer
- https://github.com/llmware-ai/llmware
- https://github.com/uclaml/SPIN
- https://medium.com/pinterest-engineering/running-unified-pubsub-client-in-production-at-pinterest-64ae2e721daa
- https://github.com/pinterest/psc
- https://arace.tech/products/radxa-rock-5a-pink-edition
- https://docs.mistral.ai/
- https://github.com/mistralai/
- https://github.com/YakrooThai/Air-Quality-Monitor/blob/main/pm25_simUart.py
- https://pkl-lang.org/main/current/introduction/use-cases.html
- https://datavolo.io/2024/02/how-to-rapidly-build-genai-chat-bots-in-slack/
- https://github.com/AI-Northstar-Tech/vector-io
- https://github.com/ocrmypdf/OCRmyPDF
- https://solr.apache.org/docs/9_5_0/changes/Changes.html#v9.5.0.new_features
- https://www.infoq.com/presentations/PRQL/
- https://prql-lang.org/
- https://github.com/kubewharf/kubeadmiral
- https://github.com/EladLeev/KeyToField-smt
- https://qdrant.github.io/fastembed/
- https://lolpop.readthedocs.io/en/latest/
- https://github.com/intel/openvino-plugins-ai-audacity/releases
- https://github.com/intel/cve-bin-tool
- https://github.com/intel/openvino-ai-plugins-gimp
- https://benyaakobi.medium.com/nifi-scheduling-a522a1c9e740
- https://github.com/segmind/segmoe
- https://github.com/tensorflow/gnn
- https://github.com/apache/nifi/pull/8152
- https://www.opencv.ai/blog/how-computer-vision-makes-people-look-more-attractive
- https://github.com/intel-retail/automated-self-checkout/tree/main
- https://github.com/openvinotoolkit/openvino_notebooks
- https://github.com/reorproject/reor
- https://github.com/traceloop/openllmetry-js
- https://xon.sh/
- https://cohere.com/research/aya
- https://github.com/AugustDev/enchanted
- https://github.com/NVIDIA/TensorRT-LLM
- https://hamel.dev/blog/posts/prompt/
- https://datatables.net/extensions/colreorder/
- https://github.com/metavoiceio/metavoice-src
- https://github.com/time-series-foundation-models/lag-llama
- https://github.com/mwmbl/mwmbl
- https://github.com/johannschopplich/unlazy
- https://github.com/kubetail-org/kubetail
- https://github.com/inoxlang/inox
- https://github.com/lanmaster53/recon-ng
- https://openai.com/sora
- https://plotly.com/javascript/
- https://dstack.ai/
- https://github.com/valine/NeuralFlow
- https://github.com/timeplus-io/proton
- https://beta.tauri.app/guides/create/
- https://github.com/5starkarma/face-smoothing
- https://github.com/aws/event-ruler
- https://notebooklm.google/
- https://github.com/mrn-aglic/apache-iceberg-data-exploration
- https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/tutorials/Certification_Trainings
- https://blog.stackademic.com/building-a-video-games-recommender-system-with-milvus-fastapi-and-docker-5c0475fa2b06
- https://docs.ultralytics.com/models/yolo-world/#predict-usage
- https://github.com/AILab-CVC/YOLO-World
- https://github.com/TabbyML/tabby
- https://github.com/facebookresearch/jepa
- https://docs.spring.io/spring-ai/reference/api/clients/huggingface.html
- https://chat.lmsys.org/
- https://blog.min.io/minio-events-with-apache-nifi/
- https://github.com/LargeWorldModel/LWM
- https://nvidia.github.io/NeMo/blogs/2024/2024-02-canary/#additional-resources
- https://funnifi.blogspot.com/2023/04/transform-json-string-field-into-record.html
- https://github.com/google/magika
- https://tools.suckless.org/sent/
- https://ines.io/blog/beginners-guide-beautiful-slides-talks/
© 2020-2024 Tim Spann