ADF Snowflake V2 connector throws GenericAdoNetReadError during Copy activity

Raphael Fritz 0 Reputation points
2024-05-06T10:11:42.4733333+00:00

The problem

We're using a Copy activity in Azure Data Factory (ADF) to move data from a storage account into a Snowflake database with the Snowflake V2 connector. Shortly after the data has been copied, the Copy activity fails with the following error:

Operation on target Copy-legstatuses-to-load failed: ErrorCode=GenericAdoNetReadError,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to execute the query command during read operation.,Source=Microsoft.DataTransfer.Connectors.GenericAdoNet,''Type=Apache.Arrow.Adbc.AdbcException,Message=[Snowflake] arrow/ipc: could not read message schema: arrow/ipc: could not read message metadata: unexpected EOF,Source=Apache.Arrow.Adbc,'

However, when we switch to the legacy (now deprecated) Snowflake connector, the same operation succeeds without any problems.
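For context, the connector is selected by the linked service type: the legacy connector uses `"type": "Snowflake"` with a connection string, while V2 uses `"type": "SnowflakeV2"`. A minimal sketch of our V2 linked service (placeholder account, user, database, and warehouse names; property names per the ADF Snowflake V2 connector, so treat the exact shape as an assumption):

```json
{
  "name": "SnowflakeV2LS",
  "properties": {
    "type": "SnowflakeV2",
    "typeProperties": {
      "accountIdentifier": "<account>",
      "user": "<user>",
      "database": "<db>",
      "warehouse": "<wh>",
      "authenticationType": "Basic"
    }
  }
}
```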

Some more details

The source container holds several million files, and we filter on the last-modified date to pick up only updated files. Depending on the interval, that yields between 10,000 and 50,000 files, each consisting of a single line of JSON. We copy the files by prefix, so we don't list each file individually but use one Copy activity for all of them.

Because there are so many files and the number changes rapidly, we can't be sure the activity is actually picking up all of them before it fails. But whether we reduce the time interval to a few hours of data or keep the full scope, the Copy activity fails either way – sometimes after 10,000 rows of data, sometimes after 40,000.
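To illustrate the setup, the source side of the Copy activity looks roughly like this (illustrative placeholders for the prefix and timestamps, not the exact pipeline JSON; property names per ADF's blob storage read settings):

```json
"source": {
  "type": "JsonSource",
  "storeSettings": {
    "type": "AzureBlobStorageReadSettings",
    "recursive": true,
    "prefix": "legstatuses/",
    "modifiedDatetimeStart": "2024-05-05T00:00:00Z",
    "modifiedDatetimeEnd": "2024-05-06T00:00:00Z"
  },
  "formatSettings": {
    "type": "JsonReadSettings"
  }
}
```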

Snowflake V2 connector broken?

So it seems to me that there is a bug in ADF's new Snowflake connector.

Yet another one, I might say.

Thus, I am getting increasingly frustrated with the introduction of that connector. Dear development team, can you please

  • help me out with this particular issue (maybe I'm doing something wrong with the configuration of the pipeline) but above all
  • fix those issues?

Thank you!

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. BhargavaGunnam-MSFT 27,156 Reputation points Microsoft Employee
    2024-05-08T23:13:30.1266667+00:00

Hello Raphael Fritz,

From my internal team discussions: there is an underlying driver issue in the new Snowflake connector when it is used with a self-hosted integration runtime (SHIR). The product group is working on deploying a fix, with an ETA around mid-May. Since you are using a managed IR, I'm not sure whether this is the cause of the issue you are seeing.

Regarding your error: it appears that the Copy activity fails when a certain threshold is exceeded.

Can you try splitting the data into smaller batches, or increasing the number of parallel copies, and see if that helps?
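As an illustration, smaller batches can be driven by parameterizing the modified-date window, and parallelism is set on the Copy activity itself. A sketch only (the parameter names `windowStart`/`windowEnd` and the numeric values are placeholders, not a recommendation):

```json
{
  "name": "Copy-legstatuses-to-load",
  "type": "Copy",
  "typeProperties": {
    "source": {
      "type": "JsonSource",
      "storeSettings": {
        "type": "AzureBlobStorageReadSettings",
        "modifiedDatetimeStart": "@{pipeline().parameters.windowStart}",
        "modifiedDatetimeEnd": "@{pipeline().parameters.windowEnd}"
      }
    },
    "sink": { "type": "SnowflakeV2Sink" },
    "parallelCopies": 8,
    "dataIntegrationUnits": 16
  }
}
```

A ForEach activity iterating over a list of (windowStart, windowEnd) pairs would then run one Copy per window.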
