Activity

[Contribution activity heatmap, November through September]

Memberships

Learn Microsoft Fabric

12.8k members • Free

2 contributions to Learn Microsoft Fabric
Efficient Fabric storage
What is the most efficient way of storing data in the lakehouse, considering future reruns of the process and recovery from a point of failure? Ingest data --> write as Parquet (backup file) --> write to a Delta table, OR ingest data --> write directly to a Delta table?
0 likes • May 22
@Sambhav R, thanks for your input. I believe we may need to keep history for at least a month, so that means it is still safer to write it out to Parquet first, right?
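For reference, a minimal sketch of the backup-then-Delta pattern discussed in this thread, assuming a PySpark notebook over the lakehouse; the source file, backup path, and table name are hypothetical, not from the original post:

# Minimal sketch: land a Parquet backup first, then load the Delta table.
# Paths and names below are assumptions for illustration only.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 1. Ingest the raw data (hypothetical source file).
raw_df = spark.read.option("header", True).csv("Files/landing/sales_2025.csv")

# 2. Write a dated Parquet backup so this run can be replayed later.
backup_path = "Files/backup/sales/2025-05-22"
raw_df.write.mode("overwrite").parquet(backup_path)

# 3. Load the Delta table from the backup; a failed load can be retried
#    from this step without re-ingesting from the source.
(spark.read.parquet(backup_path)
    .write.format("delta")
    .mode("append")
    .saveAsTable("sales_bronze"))

With the Parquet copy dated per run, history can be retained for the month discussed above and a failed or incorrect Delta load can be replayed from step 3 alone.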
Stored Proc vs. PySpark Notebook
My company is transitioning to Fabric from Azure Synapse. Currently, our data team is debating between PySpark notebooks and SQL stored procs for our data transformations. Our ultimate goal is to be platform agnostic and more future-proof, meaning that if we ever want to switch to another data platform like AWS, GCP, or even Snowflake, the transition would be much easier, more like a lift-and-shift migration. What would you recommend?
1 like • Apr 25
@Piotr Prussak, thanks for your input. Ahh right, I did not consider external API or even SFTP integration. Good point on that. Agreed that SQL is a bit generic, and if we end up using SQL it should be standardized on a dialect that is broadly supported, so I guess stick with ANSI SQL coding standards to stay cross-platform?
0 likes • Apr 29
@Lori Keller, thank you for sharing your actual experience around this decision point; I appreciate it. I share the same sentiment of wanting to learn something new with notebooks and all the capabilities they bring to the table, especially for data transformation processes, given there is some wiggle room with time. The team is now leaning towards PySpark notebooks + SQL, to leverage the power and efficiency of notebooks while not totally rewriting established SQL transformations into Python, so we do not spend extra time on that conversion. While tinkering around Fabric workspaces, I noticed it takes around 2.5 minutes to spin up a notebook when you just want a simple DML statement against a table, but once the Spark cluster is up, processes execute instantaneously.
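As a rough illustration of the notebooks + SQL approach mentioned above, here is a minimal sketch assuming a PySpark notebook attached to the lakehouse; the table and column names are hypothetical. Existing SQL transformations can run unchanged through spark.sql(), so only the orchestration moves to Python:

# Minimal sketch: reuse an existing SQL transformation inside a PySpark notebook.
# Table and column names below are hypothetical, for illustration only.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Run the established ANSI-style SQL as-is; no rewrite into DataFrame code needed.
transformed_df = spark.sql("""
    SELECT customer_id,
           SUM(amount) AS total_amount
    FROM sales_bronze
    GROUP BY customer_id
""")

# Persist the result as a Delta table in the lakehouse.
transformed_df.write.format("delta").mode("overwrite").saveAsTable("sales_silver")

Keeping the transformation logic in ANSI SQL while only the surrounding orchestration is Python also keeps the lift-and-shift option open if the platform changes later.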
Dexter Wagang
Level 2 • 12 points to level up
@dexter-wagang-5380
Data Architect

Active 126d ago
Joined Apr 21, 2025