prisma icon indicating copy to clipboard operation
prisma copied to clipboard

Using Prisma at the edge seems to cause React Server Components to randomly return 500s during prefetching. (probably Vercel Middleware problem with `next-auth`)

Open gablabelle opened this issue 1 year ago • 24 comments

Bug description

Hello,

I've setup a prisma data proxy at https://cloud.prisma.io/ and I'm now using prisma at the edge in my next.js. I use the middleware and next-auth to protect my pages, routes, etc.

Problem is that, when deployed, on every page refresh a few React Server Components are supposed to be prefetched but some of them will return an error 500. Error is INTERNAL_EDGE_FUNCTION_UNHANDLED_ERROR.

If I remove the prisma adapter from next-auth it works fine.

How to reproduce

  1. Ensure to test with a Next.js app that has a few links (using the Link component) pointing to a few pages.
  2. Create and configure your Data proxy at https://cloud.prisma.io/
  3. Generate and use the prisma client at the edge
    import { PrismaClient } from "@prisma/client/edge";
    
    declare global {
      // eslint-disable-next-line no-var
      var cachedPrisma: PrismaClient;
    }
    
    let prisma: PrismaClient;
    
    if (process.env.NODE_ENV === "production") {
      prisma = new PrismaClient();
    } else {
      if (!global.cachedPrisma) {
        global.cachedPrisma = new PrismaClient();
      }
      prisma = global.cachedPrisma;
    }
    
    export const db = prisma;
    export { type PrismaClient };
    
  4. Use the next-auth prisma adapter to use the "database" session strategy for your next-auth setup.
    import NextAuth, { type DefaultSession } from 'next-auth'
    import GitHub from 'next-auth/providers/github'
    import { PrismaAdapter } from "@next-auth/prisma-adapter";
    import { db } from "./db";
    
    declare module 'next-auth' {
      interface Session {
        user: {
          /** The user's id. */
          id: string
        } & DefaultSession['user']
      }
    }
    
    export const {
      handlers: { GET, POST },
      auth,
      CSRF_experimental // will be removed in future
    } = NextAuth({
      adapter: PrismaAdapter(db),
      providers: [GitHub],
      callbacks: {
        jwt({ token, profile }) {
          if (profile) {
            token.id = profile.id
            token.image = profile.picture
          }
          return token
        },
        authorized({ auth }) {
          return !!auth?.user // this ensures there is a logged in user for -every- request
        }
      },
      pages: {
        signIn: '/sign-in' // overrides the next-auth default signin page https://authjs.dev/guides/basics/pages
      }
    })
    
  5. Ensure that your pages are "protected" via the next.js middleware.
    export { auth as middleware } from './auth'
    
    export const config = {
      matcher: ['/((?!api|_next/static|_next/image|favicon.ico).*)']
    }
    
  6. Go to your deployed site and observe the RSC randomly throwing 500s on each page refresh. The problem doesn't happen during local development.
  7. Switch back to JWT session strategy by commenting out the adapter to not use it.
  8. Redeploy and see how everything is now working fine.

Expected behavior

React Server Components should be prefetched without any problems

Prisma information

// Add your schema.prisma
generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["postgresqlExtensions"]
}

datasource db {
  provider          = "postgresql"
  url               = env("DATABASE_URL")
  directUrl         = env("DIRECT_DATABASE_URL")
  shadowDatabaseUrl = env("SHADOW_DATABASE_URL")
  extensions        = [vector]
}

model Account {
  id                String   @id @default(cuid())
  createdAt         DateTime @default(now())
  updatedAt         DateTime @default(now()) @updatedAt
  userId            String
  type              String
  provider          String
  providerAccountId String
  refresh_token     String?  @db.Text
  access_token      String?  @db.Text
  expires_at        Int?
  ext_expires_in    Int?
  token_type        String?
  scope             String?
  id_token          String?  @db.Text
  session_state     String?

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)

  @@unique([provider, providerAccountId])
}

model Session {
  id           String   @id @default(cuid())
  sessionToken String   @unique
  userId       String
  expires      DateTime
  user         User     @relation(fields: [userId], references: [id], onDelete: Cascade)
}

model User {
  id               String          @id @default(cuid())
  createdAt        DateTime        @default(now())
  updatedAt        DateTime        @default(now()) @updatedAt
  name             String?
  email            String?         @unique
  username         String?         @unique
  emailVerified    DateTime?
  image            String?
  phone            String?
  language         String?
  country          String?
  accounts         Account[]
  sessions         Session[]
  posts            Post[]
  comments         Comment[]
  projects         ProjectUser[]
  workspaces       WorkspaceUser[]
  subscription     Subscription[]
  stripeCustomerId String?         @unique
  theme            Theme?          @relation(fields: [themeId, themeColor], references: [id, color])
  themeId          String?
  themeColor       String?
}

model VectorDocument {
  id        String   @id @default(cuid())
  createdAt DateTime @default(now())
  updatedAt DateTime @default(now()) @updatedAt
  public    Boolean

  embedding Unsupported("vector(1536)")?
}

model VerificationToken {
  identifier String
  token      String   @unique
  expires    DateTime

  @@unique([identifier, token])
}

model Post {
  id         String    @id @default(cuid())
  createdAt  DateTime  @default(now())
  updatedAt  DateTime  @default(now()) @updatedAt
  published  Boolean   @default(false)
  title      String
  slug       String    @unique
  image      String?
  excerpt    String?   @db.Text
  content    String?   @db.Text
  comments   Comment[]
  author     User      @relation(fields: [authorId], references: [id], onDelete: Cascade)
  authorId   String
  project    Project   @relation(fields: [projectId], references: [id], onDelete: Cascade)
  projectId  String
  category   Category  @relation(fields: [categoryId], references: [id])
  categoryId String
  // views      Int       @default(0)
  // likes      Int       @default(0)
  // shares     Int       @default(0)
}

model Comment {
  id        String   @id @default(cuid())
  createdAt DateTime @default(now())
  updatedAt DateTime @default(now()) @updatedAt
  content   String   @db.Text
  author    User     @relation(fields: [authorId], references: [id])
  authorId  String
  post      Post     @relation(fields: [postId], references: [id], onDelete: Cascade)
  postId    String
}

model Category {
  id    String @id @default(cuid())
  slug  String @unique
  name  String
  posts Post[]
}

model ProjectUser {
  user      User    @relation(fields: [userId], references: [id], onDelete: Cascade)
  userId    String
  project   Project @relation(fields: [projectId], references: [id], onDelete: Cascade)
  projectId String

  @@id([userId, projectId])
}

model Project {
  id          String        @id @default(cuid())
  createdAt   DateTime      @default(now())
  updatedAt   DateTime      @default(now()) @updatedAt
  users       ProjectUser[]
  posts       Post[]
  name        String?
  description String?       @db.Text
  slug        String?       @unique
  workspace   Workspace     @relation(fields: [workspaceId], references: [id], onDelete: Cascade)
  workspaceId String
}

enum WorkspaceRole {
  USER
  MANAGER
  ADMIN
}

model WorkspaceUser {
  user          User          @relation(fields: [userId], references: [id], onDelete: Cascade)
  userId        String
  workspace     Workspace     @relation(fields: [workspaceId], references: [id], onDelete: Cascade)
  workspaceId   String
  workspaceRole WorkspaceRole @default(USER)

  @@id([userId, workspaceId])
}

model Workspace {
  id             String          @id @default(cuid())
  createdAt      DateTime        @default(now())
  updatedAt      DateTime        @default(now()) @updatedAt
  projects       Project[]
  users          WorkspaceUser[]
  name           String?
  description    String?         @db.Text
  slug           String?         @unique
  subscription   Subscription?   @relation(fields: [subscriptionId], references: [id])
  subscriptionId String?
}

model Subscription {
  id                     String      @id @default(cuid())
  createdAt              DateTime    @default(now())
  updatedAt              DateTime    @default(now()) @updatedAt
  user                   User        @relation(fields: [stripeCustomerId], references: [stripeCustomerId])
  stripeCustomerId       String
  stripeSubscriptionId   String?     @unique
  stripePriceId          String?
  stripeCurrentPeriodEnd DateTime?
  workspaces             Workspace[]
}

model Theme {
  id    String @unique
  color String
  users User[]

  @@unique([id, color])
}

Environment & setup

  • OS: macOS
  • Database: PostgreSQL
  • Node.js version: node 18 (on Vercel)

Prisma Version

prisma                  : 5.1.0
@prisma/client          : 5.1.0
Current platform        : darwin-arm64
Query Engine (Node-API) : libquery-engine a9b7003df90aa623086e4d6f4e43c72468e6339b (at node_modules/.pnpm/registry.npmjs.org+@[email protected]/node_modules/@prisma/engines/libquery_engine-darwin-arm64.dylib.node)
Schema Engine           : schema-engine-cli a9b7003df90aa623086e4d6f4e43c72468e6339b (at node_modules/.pnpm/registry.npmjs.org+@[email protected]/node_modules/@prisma/engines/schema-engine-darwin-arm64)
Schema Wasm             : @prisma/prisma-schema-wasm 5.1.0-28.a9b7003df90aa623086e4d6f4e43c72468e6339b
Default Engines Hash    : a9b7003df90aa623086e4d6f4e43c72468e6339b
Studio                  : 0.492.0

gablabelle avatar Aug 05 '23 14:08 gablabelle

Thanks for the great description!

Do you get any information from the logs on what this INTERNAL_EDGE_FUNCTION_UNHANDLED_ERROR is about? Can you maybe create a minimal reproduction project and put it on GitHub so we can experience the problem ourselves? You can probably do it in a few minutes, while we would need... longer to figure things out. That would be awesome. Thanks.

janpio avatar Aug 05 '23 20:08 janpio

Hello @janpio,

I have setup the requested minimal reproduction project and it's available on Github here: https://github.com/gablabelle/prisma-edge-errors

I have also deployed that app to https://prisma-edge-errors.vercel.app/. After logging in you'll be able to observe the errors in the dev tools' Network tab. Upon each page refresh random pages will sometimes return 500s. Disable cache before refreshing the page.

Screenshot 2023-08-06 at 10 24 09

The error I get in the Vercel logs is the following for each page prefetching that failed with a 500:

[GET] [middleware: "src/middleware"] /dashboard reason=INTERNAL_EDGE_FUNCTION_UNHANDLED_ERROR, status=500, upstream_status=500, user_error=false

Screenshot 2023-08-06 at 10 25 05

gablabelle avatar Aug 06 '23 14:08 gablabelle

Thanks, playing with the reproduction right now.

I could not trigger an error 500 with https://prisma-edge-errors.vercel.app/ for now :/ Can you show the request in the devtools network panel without the request details and preview? I wonder what the exact sequence of unfiltered requests is. Is it really just these "preload" requests with _rsc?

I also deployed the project to our account, but can also not trigger the errors you describe there: https://prisma-edge-errors-ashen.vercel.app/ No matter how often I hit F5, I always get successful requests (or cancelled ones, as it seems a request was still in progress while I reloaded already)

(Aside: Are you using Vercel Postgres with this project? The shadowDatabaseUrl should not be necessary in that case. Is that still documented somewhere? If, we would like to know about it and clean that up.)

What kind of account are you using on Vercel? Any idea how I can reproduce the problem you are seeing?

janpio avatar Aug 06 '23 19:08 janpio

Tadaa, got it: image image

Unfortunately no idea what I did differently now :/

janpio avatar Aug 06 '23 19:08 janpio

Ok, I can reproduce this more reliably now: You really have to wait for all requests to finish. What I described as "cancelled ones, as it seems a request was still in progress while I reloaded already" would indeed have ended up as 500 after ~25 seconds.

That is also what I notice, all of these 500s happen slightly above 25 seconds. It almost looks like the function execution runs into some kind of timeout. Unfortunately, I do not know enough about Next.js or Vercel Middlewares to really debug this further. I reached out to Vercel, and my colleagues.

janpio avatar Aug 06 '23 20:08 janpio

@janpio I'm a bit reassured that you were able to reproduce the issue lol. When testing again a few minutes ago, I see that the error seems to be less frequent, but it still happens though.

FYI, I'm using a Neon serverless database. That's why I use the shadowDatabaseUrl.

Thanks for your quick responses, I'll be waiting for your update on this and I'll be using the JWT session strategy in the meantime.

gablabelle avatar Aug 06 '23 20:08 gablabelle

FYI, I'm using a Neon serverless database. That's why I use the shadowDatabaseUrl.

Good news: You can also remove it for Neon - they also work without shadowDatabaseUrl now: https://neon.tech/blog/prisma-dx-improvements#:~:text=Removing%20the%20need%20for%20manually%20creating%20the%20shadow%20database 👍

A few questions:

  • Did you configure a connection_limit when entering your Neon database connection string in Data Proxy?
  • Did you use the pooled or the unpooled connection string? If you used the pooled one, did it include the pgbouncer=true bit?
  • Can you maybe add a few more sub pages to your reproduction, so it preloads more URLs on the main page and see if that makes the problem more prominent?

janpio avatar Aug 06 '23 20:08 janpio

FYI, I'm using a Neon serverless database. That's why I use the shadowDatabaseUrl.

Good news: You can also remove it for Neon - they also work without shadowDatabaseUrl now:

Thanks for the heads up!

A few questions:

  • Did you configure a connection_limit when entering your Neon database connection string in Data Proxy?
  • Did you use the pooled or the unpooled connection string? If you used the pooled one, did it include the pgbouncer=true bit?

When specifying the connection string in Data Proxy, I have removed the -pooler suffix and removed the pgbouncer=true query param from the URL but I have kept the connect_timeout=10 in there.

  • Can you maybe add a few more sub pages to your reproduction, so it preloads more URLs on the main page and see if that makes the problem more prominent?

Done, deployed.

gablabelle avatar Aug 06 '23 20:08 gablabelle

Thanks. Looks to me this increased the probability to run into the 500.

Final question for today: What is your Data Proxy user- and project name? We'll want to look up what is going on in the communication with your database on our side. (Can't really be any of the Prisma side timeouts, those are not 25s by default...)

janpio avatar Aug 06 '23 21:08 janpio

Thanks. Looks to me this increased the probability to run into the 500.

Final question for today: What is your Data Proxy user- and project name? We'll want to look up what is going on in the communication with your database on our side. (Can't really be any of the Prisma side timeouts, those are not 25s by default...)

Data Proxy user: gablabelle Data Proxy project: prisma-edge-errors

gablabelle avatar Aug 06 '23 21:08 gablabelle

@janpio Hello, I'm simply wondering if this is something you guys plan to address at some point or I should forget about this for the time being? 😉

gablabelle avatar Aug 21 '23 20:08 gablabelle

Oh sorry, I didn't update you here: We could not find anything wrong on the Prisma Data Platform side. But we let Vercel and Next-Auth know and they are looking into it. No updates from that front yet though.

janpio avatar Aug 21 '23 23:08 janpio

Oh sorry, I didn't update you here:

We could not find anything wrong on the Prisma Data Platform side.

But we let Vercel and Netx-Auth know and they are looking into it. No updates from that front yet though.

Ok are the related issues public? I'd like to subscribe to the issues if possible to get updated too.

gablabelle avatar Aug 21 '23 23:08 gablabelle

I am hitting this exact issue with a similar setup. Using Supabase as the backing DB and auth0 as the auth provider.

Going to try disabling prefetch for now but hoping there is a better solution.

SeanMcGrath avatar Aug 23 '23 21:08 SeanMcGrath

Ok are the related issues public? I'd like to subscribe to the issues if possible to get updated too.

Unfortunately not, just a conversation in a shared Slack space.

I am hitting this exact issue with a similar setup. Using Supabase as the backing DB and auth0 as the auth provider.

That is actually super interesting and relevant, as it excludes Prisma Data Proxy from the equation. Does auth0 as the auth provider mean that you are also using next-auth, but use Auth0 and not Prisma? So it might not even be a Prisma related problem at all? Or did I misread that?

janpio avatar Aug 23 '23 21:08 janpio

Sorry, let me clarify:

We are not using next-auth at all. we are using auth0 and their nextjs SDK.

We have a middleware.ts edge function that reads the auth0 session data and compares that with data in prisma, fetched via Data Proxy.

the middleware matcher is as follows:

export const config = {
  // matches all non-public and non-API routes
  // API routes have their own auth
  matcher: [
    "/",
    "/communication/:path*",
    "/consumer/:path*",
    "/crm/:path*",
    "/documents/:path*",
    "/event/:path*",
    "/eventInstance/:path*",
    "/experimental/:path*",
    "/kiosk/:path*",
    "/memberships/:path*",
    "/membershipTypes/:path*",
    "/orgGroup/:path*",
    "/payments/:path*",
    "/people/:path*",
    "/products/:path*",
    "/programs/:path*",
    "/proposedClasses/:path*",
    "/reports/:path*",
    "/settings/:path*",
    "/templateDocuments/:path*",
  ],
};

my understanding is that this wouldn't match the __next routes that power prefetching but I'm pretty ignorant of how server components work, to be honest.

The result is, occasionally, a burst of prefetch requests that hang and then fail with 500s, with error codes that indicate an edge function failed - AFAIK the only edge function we use is the middleware.

SeanMcGrath avatar Aug 23 '23 21:08 SeanMcGrath

Ok, so still a database query involved - but in a pretty different stack. The failing 500s after ~25s would match the behavior that we are seeing in the other reproduction.

Can you maybe share the code you run in the middleware, or even better a minimal reproduction that we can deploy and run ourselves (together with Vercel) to figure out what might be going on? Thanks!

janpio avatar Aug 23 '23 21:08 janpio

Good morning @janpio ... No news? 😉

gablabelle avatar Oct 09 '23 12:10 gablabelle

No news unfortunately, Vercel is still investigating. Might be worth for you to update your project to more recent dependencies to confirm that the problem is indeed still happening. Especially next-auth was a manual, which is their "canary" or "dev" version I think. Maybe a released version fixes this already?

janpio avatar Oct 09 '23 19:10 janpio

Is this still happening for you recently @gablabelle?

janpio avatar Mar 18 '24 17:03 janpio

Is this still happening for you recently @gablabelle?

I've been using the JWT strategy since then and I'm unable to make the app work anymore, getting errors since I've updated the npm packages. :-(

PrismaClientKnownRequestError: 
Invalid `prisma.account.findUnique()` invocation:


Unknown server error: {"type":"UnknownTextError","body":"error code: 1016"}

gablabelle avatar Mar 18 '24 23:03 gablabelle

@janpio Took the time to fix the issue and it seems I still get the same errors

Screenshot 2024-03-29 at 09 17 16

gablabelle avatar Mar 29 '24 13:03 gablabelle

@gablabelle what if you add :path?_rsc=*) in your matcher

example: '/((?!api|_next/static|_next/image|favicon.ico|:path?_rsc=).)'

RobinGiel avatar May 11 '24 17:05 RobinGiel

@RobinGiel Thanks for the suggestion. Following your comment, I've tested, but I still get the same errors. Here is my middleware.ts file:

export { auth as middleware } from "@/lib/auth";

export const config = {
  matcher: ["/((?!api|_next/static|_next/image|favicon.ico|:path?_rsc=).*)"],
};

Screenshot 2024-05-11 at 13 15 33

gablabelle avatar May 11 '24 17:05 gablabelle

So what should we do here @gablabelle? Is this a Prisma problem? I can't really figure out how this could be caused by Prisma - and what we could do about it. Any ideas?

janpio avatar Jun 06 '24 11:06 janpio

So what should we do here @gablabelle? Is this a Prisma problem? I can't really figure out how this could be caused by Prisma - and what we could do about it. Any ideas?

@janpio What was the outcome when you reached out to the Vercel team about this?

gablabelle avatar Jun 11 '24 20:06 gablabelle

No news, unfortunately.

janpio avatar Jun 12 '24 16:06 janpio

I am observing something similar with next-auth V5 and Prisma Accelerate with my middleware auth checks. Everything works great in development but once deployed to Vercel, I will randomly get a 500 error every few requests coming from the middleware. My middleware functions are my only edge requests that are made.

[GET] [middleware: "src/middleware"] /en/app/settings reason=INTERNAL_EDGE_FUNCTION_INVOCATION_FAILED, status=500, upstream_status=500, user_error=false

I had initially added a comment to the open next-auth issue but also wanted to add my findings here if relevant to the investigation.

If I turn off Link prefetching so that the middleware execution is delayed until page click then I don't see this issue. I am not sure if this points to Prisma Client initialization when multiple requests to middleware are made at once?

package.json

  "@auth/prisma-adapter": "^2.4.1",
  "@prisma/client": "^5.16.1",
  "@prisma/extension-accelerate": "^1.1.0",
  "prisma": "^5.16.1",
  "next": "^14.2.4",
  "next-auth": "5.0.0-beta.19"

db.ts

import { PrismaClient } from "@prisma/client/edge";
import { withAccelerate } from '@prisma/extension-accelerate'

import { env } from "~/env";

const createPrismaClient = () =>
  new PrismaClient({
    log:
      env.NODE_ENV === "development" ? ["query", "error", "warn"] : ["error"],
  }).$extends(withAccelerate());

const globalForPrisma = globalThis as unknown as {
  prisma: ReturnType<typeof createPrismaClient> | undefined;
};

export const db = globalForPrisma.prisma ?? createPrismaClient();

if (env.NODE_ENV !== "production") globalForPrisma.prisma = db;

middleware.ts

import { auth } from "./server/auth";
import { NextResponse } from "next/server";

export default auth((req) => {
  //redirect to sign in page if not authenticated
  if (!req.auth && req.nextUrl.pathname !== "/auth/signin") {
    const newUrl = new URL("/auth/signin", req.nextUrl.origin)
    return Response.redirect(newUrl)
  }
});

export const config = { matcher: ["/app/:path*"] };

tycomo avatar Jul 01 '24 15:07 tycomo

I have been experiencing the same issue and figured out a solution https://github.com/nextauthjs/next-auth/issues/10773#issuecomment-2341579045

michaelcummings12 avatar Sep 10 '24 17:09 michaelcummings12