[#7] The best AI tools you cannot miss
Curious About Which LLM to Choose or How to Rapidly Build AI-Powered Websites? Meet Braintrust.dev, v0.dev, and Manus.im, along with the revolutionary MCP standard, transforming AI development.
Hey there,
It's been a wild week in AI, and I've got some juicy insights to share with you. If you're building in this space (or just trying to keep up), you'll want to read this one carefully.
Top AI Tools This Week
Before diving into the deep stuff, here's a quick rundown of the most interesting AI tools I've been playing with:
Braintrust.dev - Finally, a way to objectively measure which LLM actually performs best for your specific use case
v0.dev - Build a fully functional website in minutes with just text prompts (more on this below)
Manus.im - The autonomous agent that's making me rethink what AI can do without human intervention
MCP Servers - Connect your AI to virtually any external service with this new protocol
I've gone hands-on with all of these, and they're genuinely changing how I think about building with AI.
The AI Landscape Just Got More Interesting
So I was talking to a founder last week who's been struggling with evaluating which LLM to use in their product. They've been manually testing different models, spending hours writing prompts, and still not getting consistent results.
Sound familiar?
Here's the thing: Model evaluations are not prompt engineering. They're a systematic approach to comparing various LLMs used in AI applications.
Companies like Braintrust.dev, Confident-AI, and Noveum.ai are building specialized platforms that help you objectively measure model performance across different dimensions:
Accuracy and factuality
Reasoning capabilities
Instruction following
Safety and bias mitigation
Cost-performance ratio
Instead of guessing which model works best, these platforms let you run standardized tests across multiple models simultaneously. You can see exactly how Claude 3.5 Sonnet compares to GPT-4o for your specific use case, with actual metrics rather than vibes.
The best part? You can integrate these evaluations into your CI/CD pipeline, so you'll know immediately if a model update breaks your application.
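To make that concrete, here's a minimal sketch of what such an evaluation looks like under the hood. It is not how Braintrust.dev or any other platform actually implements it: the two "models" are hypothetical stand-in functions, the test set is made up, and exact-match accuracy is only one of many possible scoring methods. The point is the shape: same test set, every model, one score each, plus a threshold you can enforce in CI.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real model clients; swap in actual API calls.
def model_a(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unknown")

def model_b(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4"}
    return answers.get(prompt, "unknown")

# A fixed test set of (prompt, expected answer) pairs for your use case.
TEST_CASES: List[Tuple[str, str]] = [
    ("What is 2 + 2?", "4"),
    ("Capital of France?", "Paris"),
]

def accuracy(model: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the model's answer matches exactly."""
    correct = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return correct / len(cases)

def evaluate(models: Dict[str, Callable[[str], str]]) -> Dict[str, float]:
    """Run every model against the same test set."""
    return {name: accuracy(fn, TEST_CASES) for name, fn in models.items()}

def passes_gate(score: float, threshold: float = 0.9) -> bool:
    """CI-style check: treat anything below the threshold as a failed build."""
    return score >= threshold

scores = evaluate({"model_a": model_a, "model_b": model_b})
for name, score in scores.items():
    status = "pass" if passes_gate(score) else "fail"
    print(f"{name}: accuracy={score:.2f} gate={status}")
```

In a real pipeline you would replace the stand-in functions with calls to each provider and exit with a non-zero status when the gate fails, so a model update that hurts your use case blocks the deploy.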
If you're building anything serious with AI, you need this in your toolkit.
The "USB-C for AI" That's Bringing Fierce Rivals Together
Have you heard about MCP (Model Context Protocol) yet? If not, you're missing out on one of the most significant developments in AI infrastructure.
MCP is an open standard that allows AI models to connect with external data sources and services without requiring unique integrations for each service. Think of it as the "USB-C for AI" – a universal connector that standardizes how AI models interact with the world around them.
What's fascinating is that this protocol, initially developed by Anthropic, has gained support from major competitors including OpenAI and Microsoft. When fierce rivals come together around a technical standard, you know it's important.
Here's why MCP matters:
It enables smaller, more efficient AI systems that interact fluidly with external resources
It reduces vendor lock-in, allowing companies to switch AI providers while keeping the same tools
It's creating a thriving ecosystem with over 300 open-source servers already available
The top MCP servers right now include:
File System MCP Server (for local file access)
GitHub MCP Server (for code repositories)
Slack MCP Server (for team communications)
Google Maps MCP Server (for location data)
Brave Search MCP Server (for web search)
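If you're wondering what "universal connector" means in practice: MCP messages are JSON-RPC 2.0, and a client discovers and invokes tools through methods like tools/list and tools/call. Here's a rough sketch of what those requests look like on the wire. The tool name "search_web" and its arguments are hypothetical, picked only for illustration; a real server advertises its own tool names and input schemas, and a real client would also handle the initialization handshake and transport, which I've left out.

```python
import json
from typing import Any, Dict

def make_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Build a JSON-RPC 2.0 request, the message format MCP is built on."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )

# Ask a server which tools it exposes.
list_tools = make_request(1, "tools/list", {})

# Call one tool. The tool name and arguments below are hypothetical.
call_tool = make_request(
    2,
    "tools/call",
    {"name": "search_web", "arguments": {"query": "MCP servers"}},
)

print(list_tools)
print(call_tool)
```

Because every server speaks this same request shape, the client code stays the same whether it is talking to a file system, GitHub, or Slack server. That is where the reduced lock-in comes from.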
If you're building AI applications, MCP should be on your radar. It's going to fundamentally change how we architect AI systems.
Manus.im Is Changing the Game (And I'm Not Exaggerating)
I've been testing Manus for the past few weeks, and it's genuinely impressive. If you haven't heard of it yet, Manus is a general AI agent platform from China that's been spreading like wildfire globally.
Unlike traditional AI chatbots, Manus uses multiple AI models (including Claude 3.7 Sonnet and fine-tuned versions of Alibaba's Qwen) working together with various independently operating agents to act autonomously on a wide range of tasks.
What sets it apart is the "Manus's Computer" window that lets you observe what the agent is doing and intervene at any point. It feels like watching a highly intelligent intern work on your behalf.
I've used it for:
Creating comprehensive research reports
Analyzing complex datasets
Building interactive educational content
Comparing insurance policies
The results have been consistently impressive, though not perfect. It occasionally misunderstands a task, makes incorrect assumptions, or cuts corners. But it explains its reasoning clearly and improves substantially with feedback.
Manus recently launched two subscription plans:
$39/month: 3,900 credits and the ability to run two tasks simultaneously
$199/month: 19,900 credits, five simultaneous tasks, and priority access
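A quick bit of arithmetic on those two plans, using only the figures above (the dictionary keys are my own labels, not Manus terminology):

```python
# Plan figures taken from the two tiers listed above.
plans = {
    "$39 plan": {"price_usd": 39, "credits": 3900, "parallel_tasks": 2},
    "$199 plan": {"price_usd": 199, "credits": 19900, "parallel_tasks": 5},
}

def cost_per_credit(plan: dict) -> float:
    """Monthly price divided by the credits included in that plan."""
    return plan["price_usd"] / plan["credits"]

for name, plan in plans.items():
    print(f"{name}: ${cost_per_credit(plan):.4f} per credit, "
          f"{plan['parallel_tasks']} parallel tasks")
```

Both tiers come out to one cent per credit, so the higher plan is not a volume discount. What you're paying for is the larger credit pool, more simultaneous tasks, and priority access.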
If you're looking to automate complex tasks, Manus is worth checking out. Just be prepared for occasional system instability and CAPTCHA challenges when it tries to access paywalled content.
The Website Builder That's Making Developers 10x More Productive
If you're building websites or web applications, v0.dev from Vercel is a game-changer. It's an AI-powered UI generation tool that lets you describe what you want in natural language and generates production-ready React and Tailwind CSS code in seconds.
I recently used it to build a subscription management dashboard that would typically take a full day to code from scratch. With v0.dev, I had it up and running in 15 minutes.
The tool can:
Generate entire pages or individual components
Import designs directly from Figma
Create everything from landing pages to full-stack apps
Deploy directly to Vercel with one click
While it has limitations (especially for complex backend logic), it's perfect for rapid prototyping and building high-quality UIs without writing code from scratch.
Pricing starts with a free tier (limited messages per day), with premium plans at $20/month and $200/month for more intensive use.
If you're looking to speed up your development process or you're a non-developer who needs to create web interfaces, v0.dev is the best tool I've seen in this space.
API.market Hits a Major Milestone
Speaking of tools that make developers more productive, I'm excited to share that API.market has now crossed 4,000 users! This represents more than 5x growth since August 2024, when we had around 750 users.
If you're not familiar with API.market, it's a marketplace that connects API buyers and sellers across various categories, including AI, data, finance, geo, image processing, and more. It features verified providers like API League, MagicAPI, BridgeML, and AeroDataBox.
The platform offers both subscription-based access and one-time purchases, making it flexible for different needs. And with features like live API sandbox environments and encrypted API key management, it's designed to make integration as smooth as possible.
I'm particularly proud of this milestone because it validates the need for a dedicated API marketplace in today's development ecosystem. As AI and other technologies continue to advance, having easy access to specialized APIs becomes increasingly important.
Final Thoughts
The AI landscape is evolving at a dizzying pace, but these tools and platforms are making it more accessible and productive for builders at all levels.
Whether you're evaluating models with specialized platforms, connecting your AI to external services with MCP, automating complex tasks with Manus, building websites with v0.dev, or accessing specialized APIs through API.market, there's never been a better time to be building with AI.
What are you working on? Hit reply and let me know – I'm always curious to hear what this community is building.
Best Regards,
Shashank Agarwal
P.S. If you found this newsletter valuable, please forward it to a friend who might benefit from it. And if someone forwarded this to you, you can subscribe here to get future editions directly in your inbox.
P.P.S. Please mark this email as “Important” to prevent it from going to spam.