[#7] The best AI tools you cannot miss

Curious about which LLM to choose or how to rapidly build AI-powered websites? Meet Braintrust.dev, v0.dev, and Manus.im, along with the revolutionary MCP standard that is transforming AI development.

Hey there,

It's been a wild week in AI, and I've got some juicy insights to share with you. If you're building in this space (or just trying to keep up), you'll want to read this one carefully.

Top AI Tools This Week

Before diving into the deep stuff, here's a quick rundown of the most interesting AI tools I've been playing with:

  1. Braintrust.dev - Finally, a way to objectively measure which LLM actually performs best for your specific use case

  2. v0.dev - Build a fully functional website in minutes with just text prompts (more on this below)

  3. Manus.im - The autonomous agent that's making me rethink what AI can do without human intervention

  4. MCP Servers - Connect your AI to virtually any external service with this new protocol

I've gone hands-on with all of these, and they're genuinely changing how I think about building with AI.

The AI Landscape Just Got More Interesting

So I was talking to a founder last week who's been struggling with evaluating which LLM to use in their product. They've been manually testing different models, spending hours writing prompts, and still not getting consistent results.

Sound familiar?

Here's the thing: Model evaluations are not prompt engineering. They're a systematic approach to comparing various LLMs used in AI applications.

Companies like Braintrust.dev, Confident-AI, and Noveum.ai are building specialized platforms that help you objectively measure model performance across different dimensions:

  • Accuracy and factuality

  • Reasoning capabilities

  • Instruction following

  • Safety and bias mitigation

  • Cost-performance ratio

Instead of guessing which model works best, these platforms let you run standardized tests across multiple models simultaneously. You can see exactly how Claude 3.5 Sonnet compares to GPT-4o for your specific use case, with actual metrics rather than vibes.
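
To make "metrics rather than vibes" concrete, here is a minimal sketch of what such a comparison looks like under the hood. It is not how any of the platforms above are implemented: model_a and model_b are hypothetical stand-ins for real API calls, and exact-match scoring is the simplest possible metric, chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for real model calls; swap in actual API clients.
def model_a(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Paris"}.get(prompt, "")

def model_b(prompt: str) -> str:
    return {"2+2?": "4", "Capital of France?": "Lyon"}.get(prompt, "")

def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the normalized output equals the expected answer."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def evaluate(models: Dict[str, Callable[[str], str]],
             cases: List[Tuple[str, str]]) -> Dict[str, float]:
    """Run every test case against every model; return the mean score per model."""
    results = {}
    for name, call in models.items():
        scores = [exact_match(call(prompt), expected) for prompt, expected in cases]
        results[name] = sum(scores) / len(scores)
    return results

# The same test cases go to every model, so the scores are directly comparable.
CASES = [("2+2?", "4"), ("Capital of France?", "Paris")]
SCORES = evaluate({"model_a": model_a, "model_b": model_b}, CASES)
print(SCORES)
```

The point is the shape: one fixed set of cases, one scoring function, many models. Real platforms layer richer scorers (factuality, reasoning, safety) and cost tracking on top of the same idea.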

The best part? You can integrate these evaluations into your CI/CD pipeline, so you'll know immediately if a model update breaks your application.
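
A CI gate for this can be quite small. The sketch below assumes you already have per-metric scores from a stored baseline run and from the current run; the metric names, the numbers, and the 0.02 tolerance are all made up for illustration.

```python
from typing import Dict, List

def check_regression(baseline: Dict[str, float],
                     current: Dict[str, float],
                     tolerance: float = 0.02) -> List[str]:
    """Return the metrics whose score fell more than `tolerance` below baseline."""
    return [
        metric for metric, base in baseline.items()
        if current.get(metric, 0.0) < base - tolerance
    ]

# Illustrative numbers only; in practice these come from your eval runs.
BASELINE = {"accuracy": 0.90, "instruction_following": 0.85}
CURRENT = {"accuracy": 0.91, "instruction_following": 0.70}

FAILED = check_regression(BASELINE, CURRENT)
for metric in FAILED:
    print(f"REGRESSION: {metric} {BASELINE[metric]:.2f} -> {CURRENT[metric]:.2f}")

# In a CI job you would pass this to sys.exit() so a regression fails the build.
EXIT_CODE = 1 if FAILED else 0
```

A missing metric counts as a score of 0.0 here, so dropping a test from the suite by accident also fails the build.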

If you're building anything serious with AI, you need this in your toolkit.

The "USB-C for AI" That's Bringing Fierce Rivals Together

Have you heard about MCP (Model Context Protocol) yet? If not, you're missing out on one of the most significant developments in AI infrastructure.

MCP is an open standard that allows AI models to connect with external data sources and services without requiring unique integrations for each service. Think of it as the "USB-C for AI" – a universal connector that standardizes how AI models interact with the world around them.
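
Under the hood, MCP messages are JSON-RPC 2.0, which is a big part of why one client can talk to many servers. Here is a rough sketch of the two requests a client sends to discover and call a server's tools. The tool name get_weather and its arguments are hypothetical, and the sketch leaves out the initialization handshake and the transport layer that a real client needs.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched up

def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind MCP clients send."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; "get_weather" is a made-up example tool.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Berlin"}},
)

print(list_tools)
print(call_tool)
```

Because every server answers the same two methods, swapping one service for another changes the tool name and arguments, not the integration code.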

What's fascinating is that this protocol, initially developed by Anthropic, has gained support from major competitors including OpenAI and Microsoft. When fierce rivals come together around a technical standard, you know it's important.

Here's why MCP matters:

  1. It enables smaller, more efficient AI systems that interact fluidly with external resources

  2. It reduces vendor lock-in, allowing companies to switch AI providers while keeping the same tools

  3. It's creating a thriving ecosystem with over 300 open-source servers already available

The top MCP servers right now include:

If you're building AI applications, MCP should be on your radar. It's going to fundamentally change how we architect AI systems.

Manus.im Is Changing the Game (And I'm Not Exaggerating)

I've been testing Manus for the past few weeks, and it's genuinely impressive. If you haven't heard of it yet, Manus is a general AI agent platform from China that's been spreading like wildfire globally.

Unlike traditional AI chatbots, Manus uses multiple AI models (including Claude 3.7 Sonnet and fine-tuned versions of Alibaba's Qwen) working together with various independently operating agents to act autonomously on a wide range of tasks.

What sets it apart is the "Manus's Computer" window that lets you observe what the agent is doing and intervene at any point. It feels like watching a highly intelligent intern work on your behalf.

I've used it for:

  • Creating comprehensive research reports

  • Analyzing complex datasets

  • Building interactive educational content

  • Comparing insurance policies

The results have been consistently impressive, though not perfect. It occasionally misunderstands a task, makes incorrect assumptions, or cuts corners. But it explains its reasoning clearly and improves substantially with feedback.

Manus recently launched two subscription plans:

  • $39/month: 3,900 credits and ability to run two tasks simultaneously

  • $199/month: 19,900 credits, five simultaneous tasks, and priority access
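
A quick back-of-the-envelope check on those two plans, using only the prices and credit counts listed above:

```python
# (monthly price in USD, credits included), taken from the two plans above
PLANS = {"$39/month": (39, 3_900), "$199/month": (199, 19_900)}

COST_PER_CREDIT = {
    name: round(price / credits, 4) for name, (price, credits) in PLANS.items()
}

for name, cost in COST_PER_CREDIT.items():
    print(f"{name}: ${cost:.2f} per credit")
```

Both plans come out to one cent per credit, so the higher tier gives you more volume, more simultaneous tasks, and priority access rather than a cheaper rate.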

If you're looking to automate complex tasks, Manus is worth checking out. Just be prepared for occasional system instability and CAPTCHA challenges when it tries to access paywalled content.

The Website Builder That's Making Developers 10x More Productive

If you're building websites or web applications, v0.dev from Vercel is a game-changer. It's an AI-powered UI generation tool that lets you describe what you want in natural language and generates production-ready React and Tailwind CSS code in seconds.

I recently used it to build a subscription management dashboard that would typically take a full day to code from scratch. With v0.dev, I had it up and running in 15 minutes.

The tool can:

  • Generate entire pages or individual components

  • Import designs directly from Figma

  • Create everything from landing pages to full-stack apps

  • Deploy directly to Vercel with one click

While it has limitations (especially for complex backend logic), it's perfect for rapid prototyping and building high-quality UIs without writing code from scratch.

Pricing starts with a free tier (limited messages per day), with premium plans at $20/month and $200/month for more intensive use.

If you're looking to speed up your development process or you're a non-developer who needs to create web interfaces, v0.dev is the best tool I've seen in this space.

API.market Hits a Major Milestone

Speaking of tools that make developers more productive, I'm excited to share that API.market has now crossed 4,000 users! This represents more than 5x growth since August 2024, when we had around 750 users.

If you're not familiar with API.market, it's a marketplace that connects API buyers and sellers across various categories, including AI, data, finance, geo, image processing, and more. It features verified providers like API League, MagicAPI, BridgeML, and AeroDataBox.

The platform offers both subscription-based access and one-time purchases, making it flexible for different needs. And with features like live API sandbox environments and encrypted API key management, it's designed to make integration as smooth as possible.

I'm particularly proud of this milestone because it validates the need for a dedicated API marketplace in today's development ecosystem. As AI and other technologies continue to advance, having easy access to specialized APIs becomes increasingly important.

Final Thoughts

The AI landscape is evolving at a dizzying pace, but these tools and platforms are making it more accessible and productive for builders at all levels.

Whether you're evaluating models with specialized platforms, connecting your AI to external services with MCP, automating complex tasks with Manus, building websites with v0.dev, or accessing specialized APIs through API.market, there's never been a better time to be building with AI.

What are you working on? Hit reply and let me know – I'm always curious to hear what this community is building.

Best Regards,
Shashank Agarwal

P.S. If you found this newsletter valuable, please forward it to a friend who might benefit from it. And if someone forwarded this to you, you can subscribe here to get future editions directly in your inbox.

P.P.S. Please mark this email as “Important” to prevent it from going to spam.
