[MODELS] Discussion
what are the limits of using these? how many API calls can I send per month?
How can I know which model I'm using?
Out of all these models, Gemma, which was recently released, has the newest information about .NET. However, I don't know which one has the most accurate answers regarding coding
Gemma seems really biased. With web search on, when I ask it almost anything about recent events, it says it doesn't have access to recent information. But when I ask Google about the same recent events, I get responses covering them.
apparently gemma cannot code?
Gemma is just like Google's Gemini series models: it has very strong moral limits applied. Any operation that might relate to file operations or deep system access gets censored, and it refuses to reply.
So even if there are solutions for such things in its training data, they just get filtered and ignored.
That said, I still haven't tested its coding accuracy on tasks unrelated to these kinds of "dangerous" operations.
@gmanskibidi hey you brought this up quite a lot already, we know about this, no need to keep bringing it up :) Like I said before, we're working on R1 but hosting a 671B model at scale is not the same as a 32B so it takes time.
Regarding the web search query, I'm looking into it, this is an issue with our task model and not related to the model you use in chat (unless you're using a tool enabled model since those generate their own queries)
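To illustrate the point above about the task model: with web search, a separate small model rewrites the chat message into a search query before anything is fetched, so bad queries come from that step rather than from the chat model. A minimal sketch of that rewriting step (the prompt wording and function name are hypothetical illustrations, not chat-ui's actual implementation):

```python
def build_query_prompt(user_message: str) -> str:
    """Build the prompt a small 'task model' would receive to turn a
    chat message into a short web search query. The wording here is an
    assumption for illustration only."""
    return (
        "Rewrite the following user message as a short web search query.\n"
        f"Message: {user_message}\n"
        "Query:"
    )

# The task model's completion of this prompt becomes the search query,
# independently of whichever model is selected in the chat itself.
print(build_query_prompt("what is the latest linux mint version?"))
```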
ok, so then wouldn't it be much better to fix the web search issue in the Qwen and DeepSeek models inside chat-ui? we have been plagued by this problem for months now
could you try it now ? should be better
@acharyaaditya26 we're working on it! having some issues inference side but we'll be releasing as soon as that's fixed :)
thank you @nsarrazin, if there is anything I can help with please do tell, I have some experience with hosting transformer-based models in constrained environments.
what about deepseek r1 replacing the old r1-distill-32b on chat-ui? the latter suffers from a ton of problems, such as hallucinating, dropping random phrases from other languages into plain sentences, constantly failing to search for the right query, and frequently misinterpreting the user's search request.
I don't think full-fledged R1 is a good idea; it will hog a lot of GPU space that could instead be used to host specialized small models.
I'm seeing these "<|im_end|>" on meta-llama/Llama-3.3-70B-Instruct occasionally now.
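While that inference-side bug is open, a client-side workaround is to strip leaked chat-template control tokens from the response text. A minimal sketch, assuming a token list based on common chat templates (the exact set depends on the model):

```python
# Chat-template control tokens that sometimes leak into generated text
# when stop sequences are misconfigured. This list is an assumption
# covering common templates (ChatML, Llama 3, SentencePiece EOS).
SPECIAL_TOKENS = ["<|im_start|>", "<|im_end|>", "<|eot_id|>", "</s>"]

def strip_special_tokens(text: str) -> str:
    """Remove leaked control tokens from model output."""
    for tok in SPECIAL_TOKENS:
        text = text.replace(tok, "")
    return text.strip()

print(strip_special_tokens("Sure, here is the answer.<|im_end|>"))
# → Sure, here is the answer.
```

This is just a band-aid on the client; the real fix is setting the correct stop sequences on the inference side.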
TYSM! anyway we just managed to find the latest linux mint and ubuntu version after your fix.
btw, would it be better to replace nvidia's nemotron model with google's gemma3-27b, and microsoft's phi3.5 with phi-4-instruct?
and please don't hesitate to include community tools support on all newly updated models (including deepseek r1-distilled-qwen-32b, qwq-32b, and possibly phi-4 and even gemma-3-27b, ... as well).