Issue involves Google Search Console interpreting JSON paths as links, requiring careful handling of `__NEXT_DATA__`.
Google Search Console appears to treat path-like strings inside `__NEXT_DATA__` as internal links, so it tries to crawl URLs that do not exist. One potential fix is to escape forward slashes in the JSON data, but this needs careful implementation and thorough testing so that it does not break existing functionality or introduce new issues. The scope is reasonably clear.
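The slash-escaping idea mentioned above can be sketched as follows. This is only an illustration: `serializeWithEscapedSlashes` is a hypothetical helper, not a Next.js API, and whether Google's crawler actually ignores escaped slashes is an untested assumption. It relies on the fact that JSON allows `\/` as an escape for `/`, so the parsed data is unchanged.

```typescript
// Hypothetical helper (not part of Next.js): serialize to JSON, then escape
// every forward slash as "\/". In JSON.stringify output a "/" can only occur
// inside a string, and "\/" is a valid JSON escape for "/", so JSON.parse
// returns the same values as before.
function serializeWithEscapedSlashes(data: Record<string, unknown>): string {
  return JSON.stringify(data).replace(/\//g, "\\/");
}

// Sample payload: "page" is taken from the issue, "query" is made up.
const samplePayload = {
  page: "/docs/[[...slug]]",
  query: { slug: ["basic-features"] },
};

const escapedJson = serializeWithEscapedSlashes(samplePayload);
const parsedBack = JSON.parse(escapedJson);

console.log(escapedJson);     // every "/" is now written as "\/"
console.log(parsedBack.page); // the original path string, unchanged
```

The sketch only covers serialization; how such output would be wired into the `__NEXT_DATA__` script tag is not shown here and would need to be verified against the Next.js version in use.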
Operating System:
  Platform: linux
  Arch: x64
  Version: #1 SMP Wed Feb 19 06:37:35 UTC 2020
Binaries:
  Node: 12.17.0
  npm: 6.14.4
  Yarn: 1.22.10
  pnpm: N/A
Relevant packages:
  next: 12.2.0
  eslint-config-next: N/A
  react: 17.0.2
  react-dom: 17.0.2
I recognize that this might not even be something Next.js can "fix", but I want to highlight a problem we're seeing with our Next.js sites. I know there are people from Google who actively support Next.js, and they don't open their own services to issues or feature requests, so I thought I might as well try here.
tl;dr: anything that even remotely resembles a path and is returned as part of `__NEXT_DATA__` will be picked up by Google Search Console and used as part of its crawling scope.
The simplest `__NEXT_DATA__` looks something like
{ ... "page":"/docs/[[...slug]]" ...}
As seen here: https://nextjs.org/docs/basic-features/data-fetching/get-static-props.
During its crawls, Google seemingly matches anything within `__NEXT_DATA__` that starts with a slash and interprets it as an internal link. In the Search Console, this manifests as a non-resolvable page, e.g. https://nextjs.org/docs/[[...slug], with a referrer of https://nextjs.org/docs/basic-features/data-fetching/get-static-props (and any other page under /docs).
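To see which values on a page could be affected by the behavior described above, one option is to walk the parsed `__NEXT_DATA__` object and list every string that starts with a slash. The sketch below is an assumption-laden illustration: `collectPathLikeStrings` is a hypothetical helper, the "starts with a slash" rule is only the matching behavior guessed at in this issue, and everything in the sample payload except `page` is made up.

```typescript
// Hypothetical audit helper: recursively walks a parsed __NEXT_DATA__ object
// and collects every string value that starts with "/", i.e. the values this
// issue suggests a crawler might treat as internal links.
function collectPathLikeStrings(value: unknown, found: string[] = []): string[] {
  if (typeof value === "string") {
    if (value.startsWith("/")) found.push(value);
  } else if (Array.isArray(value)) {
    for (const item of value) collectPathLikeStrings(item, found);
  } else if (value !== null && typeof value === "object") {
    for (const item of Object.values(value)) collectPathLikeStrings(item, found);
  }
  return found;
}

// Sample payload: only "page" comes from the issue; the props are invented.
const sampleNextData = {
  page: "/docs/[[...slug]]",
  props: { pageProps: { title: "Example", related: ["/post", "not-a-path"] } },
};

const pathLikeValues = collectPathLikeStrings(sampleNextData);
console.log(pathLikeValues); // [ '/docs/[[...slug]]', '/post' ]
```

Running something like this against real page data gives a rough list of candidates to compare with the non-resolvable URLs reported in Search Console.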
Note that for this Next.js page, it might not manifest as I've described, since /docs is a valid page. In my particular case, using a custom server, the issue shows up as https://blog.com/post with blog posts as referrers (e.g. https://blog.com/the-post-slug) -- /post is our _