I’ll give an example. At my previous company there was a program where you basically select a start date, select an end date, select the system and press a button and it reaches out to a database and pulls all the data following that matches those parameters. The horrors of this were 1. The queries were hard coded.
-
They were stored in a configuration file, in xml format.
-
The queries were not 1 entry. It was 4, a start, the part between start date and end date, the part between end date and system and then the end part. All of these were then concatenated in the program intermixed with variables.
-
This was then sent to the server as pure sql, no orm.
-
Here’s my favorite part. You obviously don’t want anyone modifying the configuration file so they encrypted it. Now I know what you’re thinking at some point you probably will need to modify or add to the configuration so you store an unencrypted version in a secure location. Nope! The program had the ability to encrypt and decrypt but there were no visible buttons to access those functions. The program was written in winforms. You had to open the program in visual studio, manually expand the size of the window(locked size in regular use) and that shows the buttons. Now run the program in debug. Press the decrypt button. DO NOT EXIT THE PROGRAM! Edit the file in a text editor. Save file. Press the encrypt button. Copy the encrypted file to any other location on your computer. Close the program. Manually email the encrypted file to anybody using the file.


This might require a bit of background knowledge about Power Query in Excel and Power BI, specifically the concept Query Folding.
Power Query is a tool to define and run queries against a host of data sources and spit out tabular data for use in Excel (as tables) or Power BI (as Tabular Data Model). The selling point of it is the low-code graphical presentation: You transform the data by adding steps to the query, mostly through the menu ribbon. Change a column type? Click the column header > Data Type > select the new type. Perform a join? Click “Merge Queries”, select the second query, select the respective key column(s) to join on and thr join type – no typing needed. You get a nested table column you can then select which columns to expand or aggregate from.
Each step provides you with a preview of the results, and you can look at, edit, delete or insert earlier steps at will. You can also edit individual steps or the whole query through a code editor, but the appeal is obviously that even non-programmers can use it without needing to code.
Of course, it’s most efficient to have SQL transformations done by the database server already. Bur Power Query can do that too: “Query Folding” is the feature that automatically turns a sequence of Power Query steps into native SQL. A sequence like “Source, Select Columns, Filter Rows, Rename Columns” will quite neatly be converted into the SQL equivalent you’d expect. Merges will become Join, appending tables becomes Union, converting a text to uppercase becomes UPPER and so on.
If at some point there is a step it can’t fold, it will use a native query to load the data up to that point, then do the rest in-memory. Even if later steps were foldable, they’ll have to be done in-memory. You can guess that this creates a lot of potential for optimising longer queries by ensuring as much or it as possible is folded and that the result is as “small” as possible – as few rows and column as feasible etc.
Now, when I tell you that there is a table in one of our sources with a few large text columns you almost never need, you may be able to smell the smoke already. A colleague of mine needed help with his queries being slow to load. He had copied some code from Stackoverflow or what have you that joins a query with itself multiple times to resolve hierarchies. In theory, it was supposed to be foldable, provided the step it runs off of is. The general schema of my colleague’s query went Data Source -> non-foldable type conversion -> copied code -> filtering (ultimately keeping about 20% of rows) -> renaming columns -> removing columns. Want to guess which columns were loaded, processed with each join, explicitly renamed and only then finally understood to be useless and discarded?
“I always do the filtering last, don’t want to miss anything.”
This is your regularly scheduled reminder that MS (and our corporate BI team) can present Power Query as self-service data transformation tool all it wants, that still doesn’t mean it’s actually designed for use by non-data techies.