zacksiri's comments

I tested the model in an agentic workflow. Here is the report:

https://upmaru.com/llm-tests/simple-tama-agentic-workflow-q1...


Seems like it does quite well on that particular benchmark?

It's OK, but it's not the best. There are models that do better. I'd use it for some basic tasks, but not for actual complex tasks like query generation and retrieval.

I tested this model in an agentic workflow, and it failed at some very basic tasks:

https://upmaru.com/llm-tests/simple-tama-agentic-workflow-q1...


Yes, my workflows use caching intensively. It's the only way to keep things fast / economical.


This is going to be a fun one to play with. I've been conducting tests on various models for my agentic workflow.

I was just wishing they would make a new flash-lite model, because these things are so fast. Unfortunately, 2.5-flash, and therefore 2.5-flash-lite, failed some of my agentic workflows.

If 3.1-flash-lite can do the job, this solves basically all latency issues for agentic workflows.

I publish my benchmarks here in case anyone is interested:

https://upmaru.com/llm-tests/simple-tama-agentic-workflow-q1...

P.S.: The pricing bump is quite significant, but still stomachable if it performs well.


Just released API access for my agentic movie search product, so companies can build smart search into their streaming apps / tracker apps.

Here is a demo link:

https://memovee.com/platform/demo?guest_account_id=019c481b-...

Try queries like:

- "Top 10 movies of 2024, sort by highest rating first"

- "Top 10 zombie apocalypse movies"

- "Find me some good movies that take place in space, no horrors please"

- "Some good movies that will make me appreciate life"

- "Find me movies like Bladerunner"

Or whatever else you can think of. You can also tell it to "filter out movies with less than 300 votes sort by highest rating first" etc...
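For illustration, a request like that last one could plausibly translate into an Elasticsearch-style search body along the lines below. This is only a sketch: the field names `vote_count` and `rating` and the helper `build_movie_query` are hypothetical and are not taken from the actual memovee API.

```python
def build_movie_query(min_votes: int, size: int = 10) -> dict:
    """Build a search body that drops low-vote movies and sorts by rating.

    Field names (vote_count, rating) are assumptions for illustration only.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # "filter out movies with less than 300 votes" -> keep >= min_votes
                "filter": [
                    {"range": {"vote_count": {"gte": min_votes}}}
                ]
            }
        },
        # "sort by highest rating first" -> descending order
        "sort": [{"rating": {"order": "desc"}}],
    }


query = build_movie_query(min_votes=300)
```

Using a `filter` clause rather than a scoring clause keeps the vote threshold from affecting relevance scores; it only narrows the candidate set.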


I never thought of Elasticsearch as a database, and I have always designed systems around what Elasticsearch is supposed to be: an index-based document store used for search.

I think their API is great and have had amazing results with it. Their recent innovations around quantization (BBQ) have been amazing for my use case: building an agentic movie database for discovering movies and getting personalized movie recommendations.

There are benefits to not using your database for everything, even if it adds a bit of complexity by introducing another dependency. When the benefits outweigh the cost of that complexity, reaching for Elastic has almost always been worth it for me.
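As a rough sketch of what the quantization setup mentioned above might look like, the index body below enables BBQ on a vector field. It assumes an Elasticsearch version that supports the `bbq_hnsw` index option for `dense_vector` fields; the field names `title` and `embedding`, the helper `bbq_index_body`, and the dimension count are hypothetical and not taken from the memovee code.

```python
def bbq_index_body(dims: int) -> dict:
    """Index body with a BBQ-quantized dense_vector field.

    Assumes the cluster supports index_options.type = "bbq_hnsw".
    Field names are illustrative only.
    """
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                    # Quantize stored vectors to cut memory use.
                    "index_options": {"type": "bbq_hnsw"},
                },
            }
        }
    }


index_body = bbq_index_body(dims=768)
```

The body would be sent when creating the index; the primary database stays the source of truth and only the searchable fields are copied over.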


https://memovee.com - Agentic movie database

https://zacksiri.dev - My Blog


I feel for this guy, I really do.

I see comments like "is this a request to bypass sanctions" OR "he's iranian"

Let's remind ourselves of the following:

- First understand that he didn't choose to be born and raised in Iran.

- Second, people grow up, have families, and become attached to where they're born. It's not easy to just 'pick up and leave'. Moving to a new country is expensive and extremely difficult, especially from countries like Iran.

- Third, he's building something he believes in, which is probably better than most people who live in privileged countries and sit around doing nothing.

To me this reads like a plea for help.

He's built something and is showing it to the world. If someone likes it and wants to fund him / get him out of Iran so he can pursue his dreams AND have the people who help him benefit along with him, I'm sure he'll be all for that.


That might all be true, but it doesn't change the reality of the situation.

Through no fault of his own, he's in a shitty situation. Failing to acknowledge that helps nobody.


No one is failing to acknowledge that he's in a shitty situation. That's precisely the point of my post. It also doesn't mean he cannot be helped.


Having empathy for people is good.

Letting that empathy lead you into violating international sanctions to funnel money to a sanctioned country is decidedly not a good idea.

Requesting that other people sign up for Stripe accounts and send you the money in violation of sanctions is bad. The other person risks extreme legal consequences including prison sentences for deliberately and openly violating sanctions.

The comments are a warning to anyone who might feel compelled to do what is being asked, without realizing the specific request is a serious legal matter that can come with prison terms measured in decades.

> I see comments like "is this a request to bypass sanctions"

Did you read the Gist he posted? It was a direct request for someone to help him bypass sanctions.


You obviously didn't read my comment about someone helping him get out of Iran.

IF he's out of Iran and works in a company with a founder or investor, he's not violating any laws.


I read your comment.

Immigrating to a new country isn't something you do on a whim. It takes a very long time (years) and it's not really cheap. It's not a solution to the OP's problem.

The OP wasn't asking for help leaving their country. They were making a specific ask to violate sanctions.


Been building an agentic movie database. https://memovee.com is already in private testing.

The code for the agent is here https://github.com/upmaru/memovee-tama


This work is extremely consequential. When building agentic systems, determinism will significantly improve reliability.

I hope all the model providers adopt this.

