Show HN: Codeium: Free Copilot Alternative for Vim / Neovim (github.com/exafunction)
94 points by varunkmohan on Jan 18, 2023 | 80 comments
I'm Varun from the Codeium team. After adding support for VSCode, JetBrains, Jupyter, and Colab, we are super excited to bring free AI-powered code autocomplete to Vim and Neovim.

And in the spirit of Show HN, we have a playground version for anyone to try the tech in the browser without any installation (https://www.codeium.com/playground)! We also made the vim client open source and are open to contributions.



Be careful of this project.

I am analyzing their binary in a sandbox as I write this.

I was excited when I installed it a couple of hours ago. But suddenly my firewall notifications started piling up.

As soon as you install it, it downloads a large binary that includes the trained models and starts a web server on your computer. Then it phones home with what they call "meta data", and it keeps sending heartbeats even if you opt out.

Basically, the Go binary on your machine is everything the tool needs. But before giving output it dials home.


Lovely. Just take the trained model and upload it to GitHub. Let people use it without "dialing home". After all, OP mentioned in comments somewhere that they are doing it for "democratizing the technology". I cannot see a better way of democratizing. And of course OP will have no problem with it because their goal was not to make money off it, I was told.


Democratizing by spying, sounds about right ;P

They should release the current model and sell later updated models for a fee; that would be a good business idea.


Exactly.


Yeah, I have been seeing that kind of behavior a lot in several open-source/binary projects. Mostly it is telemetry, but sometimes it isn't.


May I ask what sandbox software you use?


Disgusting behavior.


Which languages do you support? My main work is in Elixir, where Copilot is incredibly weak. It’s completely understandable that less training data for less popular languages means worse performance, but it’d be interesting to hear whether you’re trying to alleviate that limitation for more niche languages somehow.


Elixir is not a language we are currently confident in, but in the very near future (O(weeks)), we believe we will have a model that is better than Copilot for Elixir and other niche languages.


If you are planning on supporting niche languages, then please support Crystal (https://crystal-lang.org/), I am currently using Copilot with it and it’s doing ok for writing tests, but is quite bad otherwise.


How is this free exactly? The cost of running a quality model is certainly not zero or even cheap. Is this explained anywhere?


It isn't zero, that's for sure! Before Codeium, we as a team were building scalable ML infrastructure for some of the world's largest ML workloads, so we have a lot of experience in building infra (especially ML serving infra) that optimizes computation on GPUs/mixed compute resources to drastically reduce serving costs. We will probably talk more about these infra side optimizations in future blog posts!


Not a very elegant way of tip-toeing around the question of "how do you finance this?/what's the catch?"


The catch is the code runs on your machine but the team doesn't tell you.


Yes, but that would actually be an awesome feature! If only it didn't call home so much to send unknown payloads...


What is your monetization strategy? What additional features are you planning that will generate revenue?


From their faq: https://www.codeium.com/faq

> How do you plan to make money?

> Fair question whenever anything is billed as free forever, especially when there are clear infrastructure serving costs as we do! To be clear, this free forever deal is in order to provide some special benefit to early adopters like yourself who have supported Codeium from the beginning. We are actively working on a bunch of exciting additional features that we may consider as part of a paid Pro subscription, but we are committed to offering state-of-the-art code completion forever for free to you and as cheaply as possible to every developer at large.


Trying this out on neovim. Is there a way to use another key combination to accept the auto completion instead of a Tab?

This is to avoid clashing with nvim-cmp's Tab configuration. The Copilot plugin, for example, uses an alt-l combination to accept the suggestion.


Thanks for the feedback. Currently there is not, but we will look into setting up custom keybindings or some other approach.


I will say this website is extremely responsive. I kept clicking back and forth on the nav links just because it loaded fast.


How does this work? Does it ship a model to me and do it all locally? I love Copilot (one of my favourite subscriptions) but I can't use it on a plane. Would love to do this and just toggle between the two.


No, the model actually runs remotely. We would love to have some way to run a lightweight model on users' machines, but it's just not feasible given the model size and the amount of computation required.


So this sends code from the user's machine to your API?

This isn't disclosed on the linked page, nor is a privacy policy linked.


What are the current requirements to run that model?


Nah. We know it's because if people had the model with them, they wouldn't pay you.


Have you looked at the size of these models these days? Do you have the compute to run a model that takes up 100s of gigs just in hard drive space?


Yeah, I know, and I have a good enough system to run these models. Google "fauxpilot".


Do you find the quality comparable to Copilot? That's still an order of magnitude fewer parameters than Copilot uses.


We are a free product.


Only for the first few customers. After that you'll want new customers to open their wallets for you. You are not a charity. So cut the BS please.


I don't really want to argue but we have been pretty open about the fact that we want to keep this product free for users going forward as well. We have been pretty clear about our plan to monetize additional features but the focus now is to democratize this technology.


> we have been pretty open about the fact that we want to keep this product free for users going forward as well

I think that parent’s tone is way too combative. But the “Pricing” page on your website[1] only says that the service is free for early users.

  One Plan

  It's pretty simple. Early users get free forever access to Codeium.

It does not say whether all future users will have to pay or only those who want additional features. The wording strongly suggests the former.

[1] https://www.codeium.com/pricing


Recent and related:

Show HN: Codeium – a free, fast AI codegen extension - https://hackertimes.com/item?id=33885676 - Dec 2022 (39 comments)


Is there a free Copilot alternative which can utilize the Salesforce CodeGen model? I am looking to stand up a service in my firm to provide code generation capabilities.


Now port it to Emacs


I just ran a few experiments with data science workflows comparing Codeium to Copilot. For my tasks, Copilot performed noticeably better.


Does this use GPT-3 or something else in the background? If not, what is the source of the training data?


We don't rely on any 3rd party APIs in order to keep serving costs down (which is necessary to keep this as a free service). We only use permissively licensed public code to train.


What counts as permissively licensed? The FAQ is mealy-mouthed about this.


Reading the FAQ, it seems to come down to "people copy/paste from publicly available code anyway so who cares. We'll look into attribution in the future maybe".

I've never read an FAQ that avoids its own questions this much. They say "permissive licenses"; they could've just listed what licenses were deemed acceptable by their scraper. Licenses like WTFPL, MIT-0 or CC0 are perfectly usable for ML because they require nothing in exchange for the code. There are projects out there using these licenses, though most will require attribution at the very least.

Microsoft went full "licenses don't apply to machine learning, sue us", just like most AI companies when it comes to things like copyright, but I don't think claiming to care about licenses and then refusing to name any permissible licenses for your model is much better. At least MS doesn't lie about their intentions.


Indeed - I would argue it is _much_ worse.


He won't answer because he's following OpenAI's criminal footsteps.


What happened to operating in good faith on this platform? That's such a mean comment to make on such a lighthearted topic.


This person had ample opportunity to answer the question. But he didn't, just like the other thieves who don't answer questions about the legality of their datasets.

It's a pattern.


You're using words like 'criminal' and 'thieves', but I'm not aware of any criminal case, and certainly not any convictions in any jurisdiction. Did I just miss it?


You can be a criminal without being convicted.


Who decides?


Totally unbiased opinion from someone who sells their own GPT-like software.


There's no GPT in my software, nor does it work like it.

Anything else you care to be wrong about?


Oh, I see that you're actually suing GitHub and OpenAI, it all makes sense now. TOTALLY not biased in any way!


Congratulations! You're WRONG AGAIN! LoL

It's not my lawsuit :-)


"We’ve filed a lawsuit challenging GitHub Copilot" from your Twitter account, but okay.

https://twitter.com/JOSourcing/status/1588296106554716160?s=...


I'm sorry, I didn't realize you were just quoting the page.


Oh, and don't bother responding to me again.

I won't answer.


Be explicit about it. What is "permissively licensed"?


MIT license still has the copyright notice requirement…


Just as my Copilot trial is expiring, nice. Will plan to try this out tonight.


Just out of curiosity, is there a reason that both this and copilot support Vim but not Emacs?


Because Vim is better than Emacs.


Heretic!


This is one of the next IDEs we will be supporting! We're keeping folks on our discord (https://discord.com/invite/3XFf78nAx5) updated on this.


You can use https://github.com/zerolfx/copilot.el to get Copilot completions in Emacs.

Video explainer: https://youtu.be/dZMGH_3UdSE


Is this OK? Autocomplete suggests the text with what appears to be someone's username.

https://freeimage.host/i/HaLPXDB


That's proof that this "programmer" is foolishly using scraped content as a dataset.

If you found that, you'll find all sorts of private information in there.

This post should be taken down.

The project is a security risk.


We are actively iterating on our data cleaning to deanonymize any user information. Of course some cases are tricky to find, but we recognize these issues are serious and will continue to work here. All of our training data is public, so we know that we won't ever produce private user info.


>We are actively iterating on our data cleaning to deanonymize any user information.

You're actively trying to doxx people?

Do you have any idea what the words you're using even mean??


If they are using publicly available code, I don't think it is appropriate to accuse them of doxxing; anyone could just go to GitHub and find the snippet of code.


What does deanonymize mean?


Not to be confused with Codium [1], a completely FOSS, telemetry-free build of VS Code

[1] https://vscodium.com/


Per the docs*, not completely telemetry-free, yet an admirable attempt.

* https://github.com/VSCodium/vscodium/blob/master/DOCS.md


Oh, thank you for the correction!


Yes, good point. Interestingly, our VSCode extension also works on VSCodium since we uploaded it to the OpenVSX registry!


Awesome! If it weren't for Microsoft's... basically monopoly on extensions, I'd be all over Codium.


The negative comments in this thread are coming from a mix of critics and assholes.

The difference between a critic and an asshole is whether they care about your success. You can safely ignore the assholes; critics provide the same information, but package it better.

I hope you and your team succeed in creating a competitor in this space; it's going to get really exciting.


I think it is okay to still be figuring out the pricing and other details while you build this out, even if it is at a Show HN stage. Perhaps, I would not recommend using this on your sensitive projects just yet.

Given that most of the responses are about "what is the trick?", "how is it funded", etc., it seems like HN does not feel the same? If they were to respond with some random VC as the source of funding, would it suddenly be less nefarious (even though the incentives are probably worse)? What are the other implied obligations of doing a Show HN?


> even if it is at a Show HN stage

'Show HN' is for things that can be tried and that's about it. It has different criteria (and inputs) from things like 'Launch HN'.

A Show HN needn't be complicated or look slick. The community is comfortable with work that's at an early stage.

https://hackertimes.com/showhn.html


Worth noting that only the extension is actually free (as in speech), the actual important parts, the server and model, are proprietary.


Your playground doesn't work well with Firefox on Android 13 (Pixel 4a). The keyboard keeps getting hijacked.


Sorry to hear that. Point noted, we will take a look. Most of our testing has been on Chrome / Safari.


Played around with it a little bit, but it always tries to write C code in TypeScript files.



