📺 screenpipe #007 | 10x better transcriptions, powerful meeting summaries

2024-09-18 · 5 min read

⬇️ download mac m app
⬇️ download windows app
give us a ⭐️(11.5k) to help the algorithm

hey louis & matt here

btw, in this newsletter, we share new features, bug fixes, work in progress, and what's next.

as a reminder, screenpipe is an open source library and app that records data from your screens & mics and passes it to an llm: an ai powered by what you've seen, said, or heard.

✌️ highlights

  • our daily active users have grown by 100% in one week
  • we crossed 100 forks on GitHub
  • 2 b2b requests for whatsapp+screenpipe automation

🤩 new features

📝 meeting summary page

many of you asked for better support for meeting summaries, so we added a page listing the calls you had today, including live transcription streaming:

meeting summary

🎙️ new voice activity detection

we just made voice activity detection much better. you should see far less noise, like stray "thank you" lines and similar errors from local audio models, and we will continue improving this! as always, you can opt for the cloud version with the deepgram model, which has higher quality.

📺 screenshots in search results

toggle "include frames" and OCR search results will include the screenshot where the text appeared:

search results

hey, you can also get it through the API (this command uses the macOS versions of date and open):

curl "http://localhost:3030/search?limit=1&offset=0&content_type=ocr&include_frames=true&start_time=$(date -u -v-220M +%Y-%m-%dT%H:%M:%SZ)&end_time=$(date -u -v-120M +%Y-%m-%dT%H:%M:%SZ)" | jq -r '.data[0].content.frame' | base64 --decode > /tmp/frame.png && open /tmp/frame.png

what can you do with this? review source frames for specific details, or send frames to vision models with custom prompts for productivity insights, task automation, and visual search.
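
the curl command above relies on the -v flag of macOS/BSD date, which GNU date on linux does not accept. here is a minimal sketch of a more portable way to build the same request url, assuming a POSIX shell. the helper names minutes_ago and frame_search_url are made up for illustration and are not part of screenpipe; the url and parameters are copied from the command above:

```shell
# print a UTC ISO-8601 timestamp from N minutes ago ($1 = N).
# tries the BSD/macOS form first (-v), then falls back to GNU date (-d).
minutes_ago() {
  date -u -v-"$1"M +%Y-%m-%dT%H:%M:%SZ 2>/dev/null \
    || date -u -d "$1 minutes ago" +%Y-%m-%dT%H:%M:%SZ
}

# print the /search url for one ocr result with its frame,
# between $1 minutes ago (start) and $2 minutes ago (end).
# hypothetical helper: only the url and parameter names come from the newsletter.
frame_search_url() {
  printf 'http://localhost:3030/search?limit=1&offset=0&content_type=ocr&include_frames=true&start_time=%s&end_time=%s\n' \
    "$(minutes_ago "$1")" "$(minutes_ago "$2")"
}

# usage (needs screenpipe running locally, plus curl and jq):
#   curl -s "$(frame_search_url 220 120)" \
#     | jq -r '.data[0].content.frame' | base64 --decode > /tmp/frame.png
```

the functions only build the url, so you can echo it and check the time window before sending anything to the api.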

🤖 select results for ai

LLMs can only ingest a limited amount of data at once, so now you can select which results go into your prompt. less context means fewer tokens used, higher-quality results, and more control:

ai results

✨ minor features

  • ignore short content in search
  • use your own openai api compatible url (openai, ollama, openrouter, etc.)
  • select FPS in settings (useful to reduce resource usage or to capture more fine-grained information)
  • max context in ai settings to adjust ui to different models
  • experimental api endpoint to merge your .mp4 videos (check server.rs)
  • icon next to search button to easily reproduce your search requests in the terminal or in code
  • /search now takes min_length and max_length args in order to reduce noise in results
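
to give an idea of how the new min_length and max_length args fit into a request, here is a small sketch that builds a /search url with both set. it assumes they are passed as query parameters like the other /search options; the helper name and the example values (50 and 2000) are made up for illustration, and the remaining parameters are copied from the curl example earlier in this issue:

```shell
# print a /search url that filters ocr results by text length.
# $1 = min_length, $2 = max_length, $3 = limit (optional, defaults to 10).
# hypothetical helper: only the endpoint and parameter names come from the newsletter.
length_filtered_search_url() {
  printf 'http://localhost:3030/search?limit=%s&offset=0&content_type=ocr&min_length=%s&max_length=%s\n' \
    "${3:-10}" "$1" "$2"
}

# usage (needs screenpipe running locally):
#   curl -s "$(length_filtered_search_url 50 2000)"
```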

🔨 bug fixes

  • fixed ollama having context only on the second message
  • fixed some audio devices not working on windows
  • include frames in the app now properly shows the frame where you got the result
  • fixed a deepgram issue from the middle of the week; deepgram should also work with all audio devices now
  • the recording process would sometimes stay in the background; now, outside of dev mode, it will "auto destruct" when the app is closed

🔮 upcoming

  • multimodal LLM in the chat (e.g. select a part of your history and feed a few frames in addition to the OCR & audio for more detailed context)
  • integration with whatsapp, including transferring all your whatsapp data + automatic scraping of new messages into screenpipe
  • daily summary feature to help you write a report to your manager
  • installation, stability & performance cross-platform, fix known issues
  • search: more relevance, less redundant information
  • new models: brand new local speech to text model - silero
  • new models: add new ocr engine (esp. for linux)
  • security: mp4 encryption at rest
  • storage: reduce storage required by half using h265 encoding
  • ux: automatic audio device switch
  • ux: faster interface, easier to ask/search through shortcuts, etc.
  • extensions: plenty of plugins you can install in a click (or build yourself) to get the most out of your data
  • extensions: use native api (automate your computer, display on the screen, etc.), real-time data streaming, high level abstraction in pipes

🙏 ask

hey, your feedback and support are super valuable to us. hit the reply button and tell us what you'd like to see in screenpipe.

like screenpipe? mention it online, it would help us grow! 🚀

you can tag screen_pipe on x and we will retweet you ❤️

  • the app is still in alpha. we've fixed tons of bugs and we're releasing daily updates to fix the remaining ones, along with new features. we're a two-person team, but we have open source contributors, and we would be happy to welcome more of them! ☺️🙏

take care,
screenpipe

follow us:
x
youtube
discord

wanna chat?

Mediar, Inc, 2 Marina Blvd B300, San Francisco, CA 94123