ollama-js

Ollama / Llava should be able to take in Base64 format w/ prefix when prompting with image

Open ykhli opened this issue 3 months ago • 0 comments

I have a function that converts an image URL (S3) to Base64, shown below:

export default async function toBase64ImageUrl(
  imgUrl: string
): Promise<string> {
  const fetchImageUrl = await fetch(imgUrl);
  const responseArrBuffer = await fetchImageUrl.arrayBuffer();

  const toBase64 = `data:${fetchImageUrl.headers.get("Content-Type") || "image/png"};base64,${Buffer.from(responseArrBuffer).toString("base64")}`;
  return toBase64;
}

I found that Ollama throws the following error if I pass in an image in the above format. It works if I pass in only the raw base64 string ("${Buffer.from(responseArrBuffer).toString("base64")}"), even though the version with the "data:image/jpeg;base64," prefix decodes successfully in both OAI and online base64 image decoders.

 ⨯ ResponseError: illegal base64 data at input byte 4
    at checkOk (webpack-internal:///(rsc)/../../node_modules/ollama/dist/shared/ollama.be8fd0da.mjs:42:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async post (webpack-internal:///(rsc)/../../node_modules/ollama/dist/shared/ollama.be8fd0da.mjs:88:5)
    at async Ollama.processStreamableRequest (webpack-internal:///(rsc)/../../node_modules/ollama/dist/shared/ollama.be8fd0da.mjs:189:26)
    at async describeImageForVideo (webpack-internal:///(rsc)/./src/utils/video.ts:225:26)
    at async POST (webpack-internal:///(rsc)/./src/app/api/describe/route.ts:11:24)
    at async /Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/compiled/next-server/app-route.runtime.dev.js:6:63809
    at async eU.execute (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/compiled/next-server/app-route.runtime.dev.js:6:53964)
    at async eU.handle (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/compiled/next-server/app-route.runtime.dev.js:6:65062)
    at async doRender (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:1333:42)
    at async cacheEntry.responseCache.get.routeKind (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:1555:28)
    at async DevServer.renderToResponseWithComponentsImpl (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:1463:28)
    at async DevServer.renderPageComponent (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:1856:24)
    at async DevServer.renderToResponseImpl (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:1894:32)
    at async DevServer.pipeImpl (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:911:25)
    at async NextNodeServer.handleCatchallRenderRequest (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/next-server.js:271:17)
    at async DevServer.handleRequestImpl (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/base-server.js:807:17)
    at async /Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/dev/next-dev-server.js:331:20
    at async Span.traceAsyncFn (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/trace/trace.js:151:20)
    at async DevServer.handleRequest (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/dev/next-dev-server.js:328:24)
    at async invokeRender (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/lib/router-server.js:163:21)
    at async handleRequest (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/lib/router-server.js:342:24)
    at async requestHandlerImpl (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/lib/router-server.js:366:13)
    at async Server.requestListener (/Users/yli/code/multi-modal-starter-kit/webapp/node_modules/next/dist/server/lib/start-server.js:140:13) {
  error: 'illegal base64 data at input byte 4',
  status_code: 400
}
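
A note on the message itself: in the string "data:image/...", the characters d, a, t, a (bytes 0 to 3) all belong to the base64 alphabet, while the ":" at byte 4 does not, which is probably why the decoder reports "input byte 4". Until prefixed input is accepted, a possible workaround is to strip the prefix before sending the image. The sketch below assumes a helper named stripDataUrlPrefix, which is hypothetical and not part of ollama-js:

```typescript
// Hypothetical helper (not part of ollama-js): removes a leading
// "data:<mime>;base64," prefix if one is present and returns the raw
// base64 payload. Input without such a prefix is returned unchanged.
function stripDataUrlPrefix(input: string): string {
  const match = /^data:[^,]*;base64,/i.exec(input);
  return match ? input.slice(match[0].length) : input;
}

// "aGVsbG8=" is the base64 encoding of the ASCII string "hello".
const prefixed = "data:image/png;base64,aGVsbG8=";
const raw = stripDataUrlPrefix(prefixed);          // "aGVsbG8="
const untouched = stripDataUrlPrefix("aGVsbG8=");  // "aGVsbG8="
```

The stripped string is the same raw base64 form that, as described above, Ollama already accepts.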

It would also be amazing if devs could just pass in an image URL without needing to do the Base64 conversion themselves!
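
For now, something like the following could cover that on the caller side. This is only a sketch: imageUrlToRawBase64 and the FetchLike type are made-up names, not an ollama-js API, and it assumes a Node runtime with a global fetch and Buffer. It returns the raw base64 string with no "data:" prefix, which is the form that works today.

```typescript
// Minimal shape of the fetch function and response that this sketch relies on.
// Declaring it separately lets a stub be swapped in for testing.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// Hypothetical helper (not part of ollama-js): downloads an image and
// returns its bytes as a raw base64 string, without a data-URL prefix.
async function imageUrlToRawBase64(
  imgUrl: string,
  fetchFn: FetchLike = fetch
): Promise<string> {
  const res = await fetchFn(imgUrl);
  if (!res.ok) {
    throw new Error(`Failed to fetch image: HTTP ${res.status}`);
  }
  return Buffer.from(await res.arrayBuffer()).toString("base64");
}
```

Unlike the function at the top of this issue, this version also fails early on a non-2xx response instead of encoding an error page as if it were an image.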

ykhli, Mar 19 '24 12:03