Parallelism and tempdb data file usage

by Daniel A G Jun 28, 2017

Configured an instance of SQL Server 2016 with four equally sized, small tempdb files. Then tested a set of queries that qualify for parallelism alternately using 4 processors, and running them single threaded.Observed that the queries always made all four files grow.

However, in that first version of the post, forgot that the default behavior in SQL Server 2016 is to grow all files in tempdb simultaneously when one grows. Basically, one small feature of SQL Server 2016 is that trace flag 1117 is always enabled by default for tempdb. Just because all the data files grew doesnâ€™t mean they all got used!

Can we just turn off TF 1117 for tempdb?

For most databases, you can control whether all the files in a filegroup grow at once by changing a setting on the filegroup.But not tempdb. If you run this command:

You get the error:

Letâ€™s use science. Otherwise known as Extended Events.

We didnâ€™t do this in the first place because it was a bunch of work. The devil is in the detail with stuff like this. A lot of events seem like theyâ€™d work, but then you try to use them and either you donâ€™t get the data youâ€™d expect, or the trace generates such a massive amount of data that itâ€™s hard to work with.

Used the sqlserver.file_read event and a histogram target

I love the histogram target because it doesnâ€™t generate a huge file of data to confuse myself with. I set up my new test this wayâ€¦

Extended Events trace for sql server.file_read

Collect file paths (just to make it easy to see the data file names)
filter on database_id=2 (tempdb)
filter on session_id = 57 (the session I happened to be testing under)
The histogram target was set to â€œbucketâ€ file_read by each file path

Set up a bunch of queries in a batch, and for each query:

Dumped everything out of memory (ran a checkpoint in tempdb and dbcc dropcleanbuffers)
Ran a short waitfor in case the checkpoint took a bit of time
Started the trace
Ran the query
Dumped the data from the histogram into a table
Stopped the trace (this clears the histogram)

Ran the whole thing a couple of times to make sure I got consistent results. And did!

Results: single threaded queries CAN use multiple tempdb data files

One of my queries does a large sort operation. Using maxdop 4 and maxdop 1, it still evenly used my tempdb data files.

Saw similar patterns with a query using a merge join, a query using a lazy spool, and a query with a sort that was engineered to be under-estimated (and has a memory spill).

Results: queries may not use tempdb files in the same ways

As you might guess, things may not always get evenly accessed, even if you have evenly sized tempdb files. One of my queries did a select into a temp table. Although it used all four tempdb files whether or not it went parallel, there were more file_read events against the first tempdb file than against the other four.

Could my methodology still have problems?

Itâ€™s possible. I havenâ€™t used the sqlserver.file_read event much, or tested it super thoroughly. It seemed to give me consistent results.So while itâ€™s possible that Iâ€™ll suddenly realize that thereâ€™s a better way to measure this.

Statistics Says It All

16

Years of Experience

3000

Gratified Students

100

Training Batches

9600

Training Hours

Please subscribe our technical blog to get recent updates.

Empire Data Systems

Parallelism and tempdb data file usage

Used the sqlserver.file_read event and a histogram target

Results: queries may not use tempdb files in the same ways

Could my methodology still have problems?

Statistics Says It All

16

3000

100

9600

Subscribe

Empire Data Systems

Parallelism and tempdb data file usage

Used the sqlserver.file_read event and a histogram target

Results: queries may not use tempdb files in the same ways

Could my methodology still have problems?

Statistics Says It All

16

3000

100

9600

Subscribe

Social Links