Transaction log – multi-statement table valued function

One day at work I got an interesting question: does a SELECT query from a multi-statement table-valued function have an impact on the transaction log?

Because a multi-statement table-valued function declares a table variable for its result set, we could expect the same behavior as described in my post on table variables.

It seemed worth testing, because it had not occurred to me that a simple SELECT from a function could affect the transaction log of the tempdb database. The results surprised me; read on if you are interested. Just a reminder: run the scripts published below in a test environment only. I am using a local installation of SQL Server on my laptop.

I will use a similar approach to the one in that post. In short:

  1. Clean up the tempdb transaction log and set the log size to the minimum
  2. Create a multi-statement table-valued function that returns data
  3. Run a simple SELECT from the function
  4. In another query window, run the undocumented function sys.fn_dblog to get data from the transaction log
  5. In another query window, run the allocation unit query extended with locking info (I will explain later)
  6. Check how much the tempdb transaction log file grew
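
Steps 1 and 6 could be sketched roughly as follows. This is only a sketch under assumptions: it presumes the tempdb log file still has its default logical name, templog, and the 1 MB target size is an illustrative value, not the setting used in the test.

```sql
USE tempdb;
GO
-- Flush dirty pages so that the inactive part of the log can be reused.
CHECKPOINT;
GO
-- Assumption: 'templog' is the default logical name of the tempdb log file.
-- The target size is given in MB; 1 is an illustrative value.
DBCC SHRINKFILE (templog, 1);
GO
-- Step 6: check the current size of the log file (size is stored in 8 KB pages).
SELECT name,
       size * 8 / 1024 AS size_mb
FROM sys.database_files
WHERE type_desc = 'LOG';
```

Running the last query again after the test shows how much the file has grown.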

First, set the tempdb transaction log to its minimum size so that any impact on it is easy to see, and run a CHECKPOINT to clean up the transaction log. Then use the query below to list the allocation units with allocated pages in tempdb.

-- Run in the context of tempdb
SELECT l.request_type, l.request_mode, sa.total_pages, so.name, so.object_id,
       sp.index_id, sp.partition_id, sp.hobt_id, sa.allocation_unit_id, sa.type_desc,
       l.*
FROM sys.objects so
JOIN sys.partitions sp ON so.object_id = sp.object_id
JOIN sys.allocation_units sa ON sa.container_id = sp.hobt_id
LEFT JOIN sys.dm_tran_locks l ON so.object_id = l.resource_associated_entity_id
WHERE sa.total_pages > 0

In my environment the query did not return any temporary tables with allocated pages, so we can start the test.

Run the script below to create the test function.

CREATE FUNCTION [dbo].[fn_GetData]()
RETURNS @TableTest TABLE (id INT, testdata INT)
AS
BEGIN
    -- Join sys.objects with itself several times to generate a large number of rows
    INSERT INTO @TableTest
    SELECT a.object_id
         , a.object_id
    FROM sys.objects a
    JOIN sys.objects b ON 1=1
    JOIN sys.objects c ON 1=1
    JOIN sys.objects d ON 1=1
    JOIN sys.objects e ON 1=1
    RETURN
END
GO

Query the data from the created function:

SELECT * FROM dbo.fn_GetData() ;

In another SQL query window, run the allocation unit query listed above. It now shows a temporary table with allocated pages in tempdb. Run the query a few times to see the count of allocated pages increasing. See the picture below.

Picture 01 Allocation units

Let's check the tempdb transaction log file size. We can see that it increased rapidly.

Picture 02 Transaction log file size

It is interesting that even a simple SELECT from a multi-statement table-valued function can affect the tempdb transaction log. The table variable declared for the result set inside the function is physically placed in tempdb. The behavior is very similar to the one in the table variable post.

In the output of the sys.fn_dblog function we can see operations on the allocation unit (the temporary table) that we got from the queries above.

Picture 03 Transaction log
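
For reference, the log records could be pulled with a query along these lines. It is a sketch, not the exact query behind the picture: sys.fn_dblog is undocumented, so the column names below are assumed from its usual output, and the LIKE filter assumes that the allocation unit name starts with dbo.# as it does for temporary objects.

```sql
USE tempdb;
GO
-- Run while the SELECT from dbo.fn_GetData() is still executing in another window.
SELECT [Current LSN],
       [Operation],
       [Context],
       [Transaction ID],
       [AllocUnitName],
       [Page ID]
FROM sys.fn_dblog(NULL, NULL)
WHERE [AllocUnitName] LIKE 'dbo.#%';
```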

This time I had a problem using DBCC PAGE to check the inserted data: I was not able to access the data as I did in the table variable post. That is why I extended the allocation unit script with locking info, where you can see that the temporary table holds an exclusive lock, so other processes cannot read data from it. In my previous post, where I tested the impact of a table variable on the transaction log, the lock mode was BU (bulk update, used for bulk loads), so I was able to access the data found through the sys.fn_dblog function.

Conclusion: not only a table variable can impact the transaction log; a multi-statement table-valued function can affect it too. This is one more reason to be careful when using these SQL Server features where query performance matters. Both object types are physically created in the tempdb database rather than held only in memory. I found one difference between the table-valued function and the table variable, and it is the lock mode on these objects: for the table variable, the temporary table in tempdb was created with the BU request mode, while the temporary table created by querying the multi-statement function was locked with X (exclusive lock). There are probably more differences, but those are for another post.


VARCHAR/NVARCHAR sizing/oversizing

Not long ago my colleagues and I discussed what the best VARCHAR/NVARCHAR sizing is, since these string data types allocate space in pages dynamically, based on the number of characters actually stored.

I remember the saying that you should size your VARCHAR/NVARCHAR columns to the size you really need. But from the first sentence of this post and from the MSDN definition it seems that it does not matter, because these data types adapt to the string you put in. Let's look at it a little closer, because the question has no simple answer.

Create two tables, each with a VARCHAR column of a different size. In my sample I created one with VARCHAR(20) and another with VARCHAR(1000). (Do not use VARCHAR(MAX), since that is another story that I will describe in one of my next posts.)

CREATE TABLE _varchar_20( id VARCHAR(20))
CREATE TABLE _varchar_1000( id VARCHAR(1000) )

Fill both tables with strings whose length matches the smaller VARCHAR size. In my case that is 20, which means that the column in the second table is oversized.

INSERT INTO _varchar_20
SELECT REPLICATE('1',20)
FROM sys.objects a
JOIN sys.objects b ON 1=1
JOIN sys.objects c ON 1=1

INSERT INTO _varchar_1000
SELECT REPLICATE('1',20)
FROM sys.objects a
JOIN sys.objects b ON 1=1
JOIN sys.objects c ON 1=1

Let's check how the tables differ from the storage point of view.

SELECT so.name, so.object_id, sp.index_id, sp.partition_id, sp.hobt_id, sa.allocation_unit_id, sa.type_desc, sa.total_pages
FROM sys.objects so
JOIN sys.partitions sp ON so.object_id = sp.object_id
JOIN sys.allocation_units sa ON sa.container_id = sp.hobt_id
WHERE so.name IN ('_varchar_20','_varchar_1000')

As we can see, there is no difference. The number of allocated pages is the same for both tables.

Picture 01 – Pages count
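
As a cross-check, the declared and the actually stored lengths could be compared with a query like the following. It is a small sketch that relies only on the two tables created above; the column aliases are illustrative.

```sql
-- COL_LENGTH returns the declared column length in bytes,
-- DATALENGTH the number of bytes actually stored in each value.
SELECT '_varchar_20' AS table_name,
       COL_LENGTH('_varchar_20', 'id') AS declared_bytes,
       MAX(DATALENGTH(id)) AS max_stored_bytes
FROM _varchar_20
UNION ALL
SELECT '_varchar_1000',
       COL_LENGTH('_varchar_1000', 'id'),
       MAX(DATALENGTH(id))
FROM _varchar_1000;
```

Both rows should report 20 stored bytes, while the declared lengths are 20 and 1000.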

Now run simple SELECT queries and compare the execution plans to check whether there is any impact on query execution.

SELECT * FROM _varchar_20
SELECT * FROM _varchar_1000

At first look, both execution plans seem to be the same.

Picture 02 – Simple query plan comparison

Only one property differs, Estimated Row Size, but it evidently has no impact on the execution of these queries.

Picture 03 – Estimated Row Size

Picture 04 – Estimated Row Size

At this point we could say that VARCHAR/NVARCHAR sizing has no impact on storage or query execution. But let's add a sort operator to our queries and run them again.

SELECT * FROM _varchar_20 ORDER BY id

SELECT * FROM _varchar_1000 ORDER BY id

As you can see, the query that reads from the smaller VARCHAR/NVARCHAR column runs with a lower query cost and performs much better. What happened?

Picture 05 Query plan comparison


Click on the SELECT operator to see its properties for both queries. A Memory Grant row has appeared, meaning that the query asked for a memory reservation for the sort, based on the estimated row size mentioned above. The optimizer does not always look at the data really stored in the objects; sometimes, as in this case, it relies on statistics or catalog schema info such as the declared column length.

Picture 06 – Memory grant

Picture 07 – Memory grant

So a small change in the query led to different query plans and cost estimates, with better performance for the smaller VARCHAR/NVARCHAR sizing.
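
To compare the grants without opening the plan properties, something like the following could be used. It is only a sketch: it assumes a SQL Server version where sys.dm_exec_query_stats exposes the last_grant_kb and last_used_grant_kb columns (SQL Server 2016 and later do), that both ORDER BY queries have just been run, and that their plans are still in the plan cache.

```sql
SELECT st.text AS query_text,
       qs.last_grant_kb,       -- memory granted to the last execution
       qs.last_used_grant_kb   -- memory the last execution actually used
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE st.text LIKE '%ORDER BY id%'
  AND st.text NOT LIKE '%dm_exec_query_stats%';  -- exclude this query itself
```

If both statements were sent as one batch, query_text shows the whole batch text for each row.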

My recommendation is to size VARCHAR/NVARCHAR columns without oversizing them when it is not really necessary. Of course there are scenarios where you expect the data to grow over time, but increasing the length of a VARCHAR/NVARCHAR column is still an easier job than reducing it.

It would be interesting to look at this topic from more perspectives. I will continue with it in my next posts, where I extend it to indexes, predicates, etc. Stay tuned!

 

Table variable myths

There are lots of myths about table variables. You can find many claims that a table variable has no impact on the transaction log because it is outside the scope of a transaction, and many articles saying that a table variable is stored in memory. I decided to run some tests to see whether a table variable can have an impact on the transaction log and whether it is physically created in tempdb.

An important note at the beginning: try the queries below in your test environment only! Do not run them in production. I used SQL Server 2017 installed locally on my laptop, which also let me eliminate the possible impact of other processes running on SQL Server.

In my sample I created a simple table variable and filled it with lots of data in a WHILE loop. In another query window I use the undocumented sys.fn_dblog function to read what is happening in the transaction log.

I sized the tempdb transaction log file to a very low value, 4 MB. We will see whether inserts into the table variable can increase the file size.

Picture 01 Log size

Let's clean up the tempdb transaction log first. The script below runs a CHECKPOINT and then calls the sys.fn_dblog function to show what the transaction log looks like. Only three records are returned.

CHECKPOINT 
SELECT  * FROM sys.fn_dblog(NULL, NULL)

Picture 02 Empty tempdb transaction log

Execute a script that inserts rows into a table variable in a never-ending WHILE loop, as I did. The main issue was to keep the query running while getting data from the log: once the query was stopped, I was no longer able to get the needed data from the log function. So it is important that the query with the table variable you want to analyze is still running while you read the transaction log in another query window.
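
The original insert script is not shown here, so the following is only a reconstruction of what such a loop could look like. The variable name @TableTest and the loop counter are assumptions; the column names id and testdata are taken from the page output shown later in this post. The loop never ends on its own, so cancel the query manually when you are done.

```sql
-- Print the session ID first; it is needed to filter sys.fn_dblog in the other window.
SELECT @@SPID AS current_spid;

DECLARE @TableTest TABLE (id INT, testdata INT);  -- assumed definition
DECLARE @i INT = 0;

-- Endless loop: keeps the table variable alive and the log records visible.
WHILE 1 = 1
BEGIN
    INSERT INTO @TableTest (id, testdata)
    VALUES (@i, @i);

    SET @i += 1;
END;
```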

-- Replace 56 with the SPID of the session running the insert
SELECT [Transaction Name], [spid], [Xact ID], f.[Page ID], f.[parent transaction id], [Transaction ID], AllocUnitName, *
FROM sys.fn_dblog(NULL, NULL) f
WHERE [SPID] = 56

Get the SPID of the inserting query and run the query above in a new query window with that SPID. In the output you can see the name of the temporary table we created as the name of a transaction. We are looking for the Transaction Name AllocPages.

Picture 04 Get log data based on parent transaction

Take the Parent Transaction ID from the row where the Transaction Name column = AllocPages and change the predicate to select records based on Transaction ID:

SELECT [Transaction Name], [spid], [Xact ID], f.[Page ID], f.[parent transaction id], [Transaction ID], [Transaction Name], AllocUnitName, *
FROM sys.fn_dblog(NULL, NULL) f
WHERE [Transaction ID] = '0000:00002622'

Here we can see the temporary table name in the AllocUnitName column. The allocation unit name follows the same convention as for a local temporary table. It really seems that a table variable is physically created in tempdb as a temporary table.

Picture 05 Result from log filtered by transaction ID

Another view is available through allocation units, where you can check how many pages our transaction used for the table variable.

SELECT so.name, so.object_id, sp.index_id, sp.partition_id, sp.hobt_id, sa.allocation_unit_id, sa.type_desc, sa.total_pages
FROM sys.objects so
JOIN sys.partitions sp on so.object_id = sp.object_id
JOIN sys.allocation_units sa on sa.container_id = sp.hobt_id

Picture 06 Get data from allocation units and partitions

Now we verify that the temporary table is connected to our table variable and, if so, that the data are stored in tempdb. We check that the data inserted into the table variable can be found in the pages of the tempdb data files. I took the first Page ID from a row with an Operation of type LOP_MODIFY_ROW. It is highlighted in Picture 05: 0005:0000bd90. The first number, 0005, corresponds to the tempdb file ID; the second number, converted from hex to decimal, is the page ID 48528. Use the DBCC command below to read this allocation page (judging by its number, 48528 = 6 × 8088, a PFS page) and find out where the pages with data are placed.

DBCC PAGE(2,5, 48528 , 3) WITH TABLERESULTS;

The first parameter, 2, is the tempdb database ID. The second parameter, 5, is the database file ID, 48528 is the page number, and the last parameter, 3, is the output style. Below we get the list of ranges where pages are allocated.

Picture 07 List allocated pages
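
The hex-to-decimal conversion of the Page ID can also be done directly in T-SQL. A minimal sketch, using the value 0005:0000bd90 from this post; the variable name and column aliases are illustrative.

```sql
DECLARE @PageId VARCHAR(13) = '0005:0000bd90';

-- Style 2 converts a hex string without the 0x prefix to binary.
SELECT CONVERT(INT, CONVERT(VARBINARY(2), LEFT(@PageId, 4), 2))  AS file_id,
       CONVERT(INT, CONVERT(VARBINARY(4), RIGHT(@PageId, 8), 2)) AS page_id;
-- file_id = 5, page_id = 48528
```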

Let's choose one page from the allocated page ranges listed above (highlighted). I chose page 48537 and ran the DBCC command again.

DBCC PAGE(2,5, 48537 , 3) WITH TABLERESULTS;

Look at the details in the DBCC output below. In the Field column we can find id and testdata, the column names defined in our table variable. The VALUE column shows the data already inserted.

Picture 08 page detail

We verified that the temporary table dbo.#BC836344 is the table variable we declared in our test query.

Now look at the transaction log size. We see that the tempdb log file size increased.

Picture 09 Transaction log size

After stopping the query that inserts data into the table variable, we can see that the temporary table disappeared from the transaction log.

Picture 10 Transaction log

In the end we verified that

  • a table variable is actually a temporary table created in tempdb, persisted while the query runs
  • the inserted data can be read by accessing pages in tempdb
  • DML operations on a table variable have an impact on the tempdb transaction log
  • they can even cause an unexpected increase of the transaction log size

What is still not clear to me, or maybe I just don't see it, is why it is implemented this way. Microsoft describes a table variable as a table outside of transaction scope. Why is there a need to write its data to the transaction log? It seems useless to me.
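
The "out of transaction scope" behavior itself is easy to illustrate. In the sketch below (the variable name is illustrative) the row survives an explicit ROLLBACK. One plausible reason for the logging, stated here as an assumption rather than something tested in this post, is that each individual statement against a table variable must still be undoable if it fails partway through.

```sql
DECLARE @t TABLE (id INT);

BEGIN TRANSACTION;
    INSERT INTO @t (id) VALUES (1);
ROLLBACK TRANSACTION;

-- The table variable is not affected by the rollback, so the row is still there.
SELECT COUNT(*) AS rows_after_rollback FROM @t;  -- returns 1
```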

With this post I showed that a table variable is actually a temporary table, created in tempdb with some specific behavior. Next time it could be interesting to compare this sample with a local temporary table to see the differences in the transaction log and page allocation. Stay tuned.