


Link to original content: http://github.com/BAAI-Agents/Cradle/issues/76

Support Ollama #76

Open
turtoes opened this issue Jul 29, 2024 · 4 comments

Comments

@turtoes

turtoes commented Jul 29, 2024

Thanks for the amazing work. Of course, for a brand-new beginner it takes some time to set up the entire environment.
As the README already says, the current best model to use is GPT-4o, which most of us understand. On the other hand, the OpenAI and Claude models have been coded as predefined configs/formats. I wonder, is there any plan to support Ollama for offline LLM experiments? Since Ollama can be called through an API as well, that should be possible and would be a standard way to do it.

Again, I am brand new to both LLMs and agents, so please correct me if I am wrong in any way. I hope Ollama can be supported soon!

Once again, thanks for the great work!

@dbgkmuc

dbgkmuc commented Jul 29, 2024

I am trying to port Cradle to llama.cpp with Mistral NeMo 13B. So far the main issue is that the LLM provider inside Cradle only supports OpenAI and Claude. llama.cpp's server API is mostly compatible with OpenAI's, but Ollama's is far from compatible, so if you go with Ollama it will definitely cost you more time, as you have to write a new LLM client.
I am not at the LLM planning level yet, but I doubt that Mistral NeMo is on par with GPT-4, so failure on most tasks is to be expected.
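
For reference, a new client against Ollama's native chat endpoint would not have to be large. Here is a rough sketch, assuming a default local Ollama install on port 11434; the model name is just an example:

import requests

# Minimal sketch of a native Ollama chat client (not Cradle's provider
# interface). Assumes a default local install listening on port 11434.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

def ollama_chat(model: str, messages: list) -> str:
    """POST to Ollama's /api/chat and return the assistant reply text."""
    payload = {"model": model, "messages": messages, "stream": False}
    resp = requests.post(OLLAMA_CHAT_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["message"]["content"]

print(ollama_chat("mistral-nemo", [{"role": "user", "content": "Say hi."}]))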

@sangeet0003

LM Studio uses the same API structure as OpenAI, so is it possible to run via that? I have tried setting set OPENAI_API_BASE_URL="http://localhost:1234/v1" and I have also changed the openai_config.json file to add a base URL. Is there anything in the code to change the base URL for OpenAI?

{
    "key_var": "OA_OPENAI_KEY",
    "base_var": "OA_BASE_URL",
    "emb_model": "text-embedding-ada-002",
    "comp_model": "gpt-4o",
    "is_azure": false
}
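
As a sanity check, the override can be tested outside Cradle with a standalone call. This is only a sketch: "lm-studio" is a dummy key (LM Studio ignores it) and the model name is a placeholder for whatever model is loaded:

from openai import OpenAI

# Sketch: verify that the base-URL override reaches LM Studio at all,
# independently of Cradle's provider code.
client = OpenAI(api_key="lm-studio", base_url="http://localhost:1234/v1")

resp = client.chat.completions.create(
    model="local-model",  # placeholder; LM Studio serves the loaded model
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)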

@DVampire
Collaborator

DVampire commented Aug 1, 2024

You can modify line 108 of cradle/provider/llm/openai.py. The OpenAI client exposes a base_url parameter.

self.client = OpenAI(api_key=key, base_url=...)
class OpenAI(SyncAPIClient):
    completions: resources.Completions
    chat: resources.Chat
    edits: resources.Edits
    embeddings: resources.Embeddings
    files: resources.Files
    images: resources.Images
    audio: resources.Audio
    moderations: resources.Moderations
    models: resources.Models
    fine_tuning: resources.FineTuning
    fine_tunes: resources.FineTunes
    beta: resources.Beta
    with_raw_response: OpenAIWithRawResponse

    # client options
    api_key: str
    organization: str | None

    def __init__(
        self,
        *,
        api_key: str | None = None,
        organization: str | None = None,
        base_url: str | httpx.URL | None = None,
        timeout: Union[float, Timeout, None, NotGiven] = NOT_GIVEN,
        max_retries: int = DEFAULT_MAX_RETRIES,
        default_headers: Mapping[str, str] | None = None,
        default_query: Mapping[str, object] | None = None,
        # Configure a custom httpx client. See the [httpx documentation](https://www.python-httpx.org/api/#client) for more details.
        http_client: httpx.Client | None = None,
        # Enable or disable schema validation for data returned by the API.
        # When enabled an error APIResponseValidationError is raised
        # if the API responds with invalid data for the expected schema.
        #
        # This parameter may be removed or changed in the future.
        # If you rely on this feature, please open a GitHub issue
        # outlining your use-case to help us decide if it should be
        # part of our public interface in the future.
        _strict_response_validation: bool = False,
    ) -> None:
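
Concretely, the change could look like the sketch below. The class and method names are stand-ins for Cradle's actual provider; OA_BASE_URL is the "base_var" name from openai_config.json above, and leaving it unset keeps the default OpenAI endpoint:

import os
from openai import OpenAI

class OpenAIProviderSketch:  # stand-in for Cradle's provider class
    def _create_client(self, key: str) -> None:
        # Around line 108 of cradle/provider/llm/openai.py: read the base
        # URL from the environment instead of hard-coding the default.
        # base_url=None falls back to the standard OpenAI endpoint.
        base_url = os.getenv("OA_BASE_URL")  # e.g. "http://localhost:1234/v1"
        self.client = OpenAI(api_key=key, base_url=base_url)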

@sangeet0003

I have changed self.client = OpenAI(api_key="lm-studio", base_url="http://localhost:1234/v1") at line 108, and now it is redirecting to LM Studio. Here is the output from runner.py:
python runner.py --envConfig "./conf/env_config_chrome.json"
C:\Users\sa\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:3527.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
2024-08-01 20:41:56,739 - UAC Logger - CPU: 0.0%, Memory: 93.6% - INFO - Loading skills from ./res/chrome/skills/skill_lib.json
2024-08-01 20:53:29,591 - openai._base_client - CPU: 0.0%, Memory: 86.6% - INFO - Retrying request to /embeddings in 0.751233 seconds
Here is the log from LM Studio:

[2024-08-01 20:43:29.588] [INFO] Received POST request to /v1/embeddings with body: { "input": [ [ 6014, 369, 459, 4652, 449, 2316, 8649, 330, 92459, 3331, 17863, 7935, 1, 323, 3665, 1202, 11612, 1052, 13 ] ], "model": "text-embedding-ada-002", "encoding_format": "base64" }
[2024-08-01 20:53:30.395] [INFO] Received POST request to /v1/embeddings with body: { "input": [ [ 6014, 369, 459, 4652, 449, 2316, 8649, 330, 92459, 3331, 17863, 7935, 1, 323, 3665, 1202, 11612, 1052, 13 ] ], "model": "text-embedding-ada-002", "encoding_format": "base64" }
Also, where in the code should I attach the second code block you provided?
