We have replication from an Azure SQL Server database to a Mirrored item. The replication writes a new parquet file containing just a few rows each time the source DB updates (it's a call-centre transactional DB), which means thousands of parquet files are being created. It's not possible to run the Delta OPTIMIZE command to compact them, because we don't have permission to edit the location the mirror uses (the mirror's own Azure Blob Storage). I believe these fragmented files are putting a large strain on the metadata updates of the lakehouses that shortcut to this data. I can't be certain, but I think this may be causing the delays and lockups we're seeing in the schema shortcuts downstream.
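Not something we've productionised, but as a minimal sketch, here's roughly how one could quantify the fragmentation by scanning a path for parquet files (the path, function name, and the 128 KB "small file" threshold are just assumptions for illustration):

```python
from pathlib import Path


def parquet_fragmentation_report(root: str, small_kb: int = 128) -> dict:
    """Count parquet files under `root` and how many fall below `small_kb` KB.

    A high proportion of tiny files is the classic symptom of the
    small-file problem that compaction (OPTIMIZE) would normally fix.
    """
    files = list(Path(root).rglob("*.parquet"))
    small = [f for f in files if f.stat().st_size < small_kb * 1024]
    total_bytes = sum(f.stat().st_size for f in files)
    return {
        "file_count": len(files),
        "small_file_count": len(small),
        "total_mb": round(total_bytes / (1024 * 1024), 2),
        "avg_kb": round(total_bytes / 1024 / len(files), 1) if files else 0.0,
    }


# Hypothetical usage: point it at a mounted/local copy of the mirrored table.
# report = parquet_fragmentation_report("/lakehouse/default/Tables/calls")
# print(report)
```

Even without permission to compact, a report like this would at least confirm whether file count correlates with the shortcut metadata delays.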
I'm very interested in how others are implementing replication and how it interacts with their downstream dataflows.