Transaction log – multi-statement table-valued function

Some time ago I got an interesting question at work: does a SELECT query from a multi-statement table-valued function have any impact on the transaction log?

Because a multi-statement table-valued function declares a table variable for its result set, we could expect the same behavior as described in my previous post on table variables.

It seemed worth running some tests, because I had not considered that a simple SELECT from a function could affect the transaction log of the tempdb database. The results surprised me, so keep reading if you are interested. Just a reminder: run the scripts published below on a testing environment only. I am using a local installation of SQL Server on my laptop.

I will use a similar approach as in that post. In short:

  1. Clean up the tempdb transaction log and shrink the log to its minimum size
  2. Create a multi-statement table-valued function
  3. Run a simple SELECT from the function
  4. In another query window, run the undocumented function sys.fn_dblog to read the transaction log
  5. In another query window, run the allocation-unit query extended with locking info (I will explain later)
  6. Check how much the tempdb transaction log file grew

First, shrink the tempdb transaction log to its minimum size so that any impact on the log will be easy to see, and run CHECKPOINT to clean the log up.
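A minimal sketch of that clean-up step (assuming the tempdb log file keeps its default logical name, templog; check tempdb.sys.database_files if yours differs):

USE tempdb;
GO
-- flush dirty pages so the log can be truncated
CHECKPOINT;
GO
-- shrink the log file to roughly its minimum size (target 1 MB)
DBCC SHRINKFILE (templog, 1);
GO

Then, with the query below, check the list of allocation units with allocated pages in tempdb, joined to locking info from sys.dm_tran_locks.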

SELECT l.request_type, l.request_mode, sa.total_pages, so.name, so.object_id,
       sp.index_id, sp.partition_id, sp.hobt_id, sa.allocation_unit_id, sa.type_desc,
       l.*
FROM sys.objects so
JOIN sys.partitions sp ON so.object_id = sp.object_id
JOIN sys.allocation_units sa ON sa.container_id = sp.hobt_id
LEFT JOIN sys.dm_tran_locks l ON so.object_id = l.resource_associated_entity_id
WHERE sa.total_pages > 0

In my environment the query did not return any temporary tables with allocated pages, so we can start the test.

Run the script below to create the testing function.

CREATE FUNCTION [dbo].[fn_GetData]()
RETURNS @TableTest TABLE (id INT, testdata INT)
AS
BEGIN
    INSERT INTO @TableTest
    SELECT a.object_id
         , a.object_id
    FROM sys.objects a
    JOIN sys.objects b ON 1=1
    JOIN sys.objects c ON 1=1
    JOIN sys.objects d ON 1=1
    JOIN sys.objects e ON 1=1

    RETURN
END
GO

Run the following query to get data from the created function.

SELECT * FROM dbo.fn_GetData();

In another SQL query window, run the allocation-unit query listed above. After querying the allocation units in tempdb we can see that there is a temporary table with allocated pages. Run the query a few times to see that the count of allocated pages keeps increasing. See the picture below.

Picture 01 – Allocation units

Let's check the tempdb transaction log file size. We can see that it grew rapidly.

Picture 02 – Transaction log file size
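If you prefer a query over the screenshot, a quick sketch that shows the same thing:

-- percentage of each database's log currently in use
DBCC SQLPERF(LOGSPACE);

-- physical size of the tempdb log file in MB
SELECT name, size * 8 / 1024.0 AS size_mb
FROM tempdb.sys.database_files
WHERE type_desc = 'LOG';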

It is interesting that even a simple SELECT from a multi-statement table-valued function can affect the tempdb transaction log. The table variable declared for the result set inside the function is physically placed in tempdb, which is very similar to the behavior described in the previous post.

From the sys.fn_dblog function we can see the operations logged against the allocation unit (the temporary table) that we identified with the queries above.

Picture 03 – Transaction log
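A sketch of the kind of sys.fn_dblog query that produces this output; the allocation unit ID in the filter is a placeholder, replace it with the allocation_unit_id returned by the allocation-unit query above:

USE tempdb;
GO
SELECT [Current LSN], Operation, Context, AllocUnitId, AllocUnitName, [Transaction ID]
FROM sys.fn_dblog(NULL, NULL)
WHERE AllocUnitId = 72057594043236352   -- hypothetical value, take it from the allocation-unit query
ORDER BY [Current LSN];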

This time I had a problem using DBCC PAGE to check the inserted data; I was not able to access the data as in the previous post. That is why I extended the allocation-unit script with locking info: you can see that the temporary table holds an exclusive lock, so other processes cannot read data from it. In my previous post, where I tested the impact of a table variable on the transaction log, a BU (bulk load) lock mode was set, so I was able to access the data.

Conclusion: not only a table variable can impact the transaction log, a multi-statement table-valued function can affect it too. It is very interesting, and it is one more reason to be careful when using these SQL features with respect to query performance. Both object types are physically created in the tempdb database, not in memory. I found one difference between the table-valued function and the table variable: the lock mode taken on these objects. In the case of the table variable, the temporary table created in tempdb was requested with the BU lock mode, while the temporary table created by querying the multi-statement function was locked with an X (exclusive) lock. There are probably more differences, but maybe next time, in another post.


VARCHAR/NVARCHAR sizing/oversizing

Not so long ago I discussed with my colleagues what the best VARCHAR/NVARCHAR sizing is, since these string data types allocate space in pages dynamically, based on the number of characters actually stored.

I remember the rule of thumb that you should size your VARCHAR/NVARCHAR columns to the length you really need. But from the first sentence of this post and the MSDN definition, it seems that it should not matter: these data types adapt to the string you put in. Let's have a closer look, because the question cannot be answered that easily.

Create two tables, each with one VARCHAR column but with a different size. In my sample I created one with VARCHAR(20) and another with VARCHAR(1000). (Do not use VARCHAR(MAX), since that is another story which I will describe in one of my next posts.)

CREATE TABLE _varchar_20( id VARCHAR(20))
CREATE TABLE _varchar_1000( id VARCHAR(1000) )

Fill both tables with strings whose length corresponds to the table with the lower VARCHAR size, in my case 20 characters. That means the second table's column will be oversized.

INSERT INTO _varchar_20
SELECT REPLICATE('1',20)
FROM sys.objects a
JOIN sys.objects b ON 1=1
JOIN sys.objects c ON 1=1

INSERT INTO _varchar_1000
SELECT REPLICATE('1',20)
FROM sys.objects a
JOIN sys.objects b ON 1=1
JOIN sys.objects c ON 1=1

Let's check how the tables differ from the storage point of view.

SELECT so.name, so.object_id, sp.index_id, sp.partition_id, sp.hobt_id,
       sa.allocation_unit_id, sa.type_desc, sa.total_pages
FROM sys.objects so
JOIN sys.partitions sp ON so.object_id = sp.object_id
JOIN sys.allocation_units sa ON sa.container_id = sp.hobt_id
WHERE so.name IN ('_varchar_20','_varchar_1000')

As we can see, there is no difference. The number of allocated pages is the same for both tables.

Picture 01 – Pages count
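As a side check (a sketch, not required for the test), the average record size physically stored on the pages should also be the same for both tables, since VARCHAR only stores the characters actually inserted:

-- DETAILED mode is needed so avg_record_size_in_bytes is populated
SELECT OBJECT_NAME(object_id) AS table_name, avg_record_size_in_bytes, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('_varchar_20'), NULL, NULL, 'DETAILED')
UNION ALL
SELECT OBJECT_NAME(object_id), avg_record_size_in_bytes, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('_varchar_1000'), NULL, NULL, 'DETAILED');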

Now run simple SELECT queries and compare the execution plans to check whether there is any impact on query execution.

SELECT * FROM _varchar_20
SELECT * FROM _varchar_1000

At first glance both execution plans look the same.

Picture 02 – Simple query plan comparison

Only one counter differs – Estimated Row Size – but it evidently has no impact on the execution of this query.

Picture 03 – Estimated Row Size

Picture 04 – Estimated Row Size

At this point we could say that the sizing of VARCHAR/NVARCHAR has no impact on storage or query execution. BUT let's add a sort operator to our queries and run them again.

SELECT * FROM _varchar_20 ORDER BY id

SELECT * FROM _varchar_1000 ORDER BY id

As you can see, the query reading the table with the smaller VARCHAR column runs with a lower query cost and performs much better. What happened?

Picture 05 – Query plan comparison


Click on the SELECT operator and compare its properties for both queries. A Memory Grant row appears, meaning the query asked for a memory reservation based on the estimated row size mentioned above. The optimizer does not always look at the data actually stored in the objects; in cases like this it relies on statistics and catalog metadata, such as the declared column size.

Picture 06 – Memory grant

Picture 07 – Memory grant
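If you want to see the grant outside of the plan properties, you can also catch it in sys.dm_exec_query_memory_grants while the ORDER BY queries are running (a sketch; with such small tables the queries finish quickly, so you may need a larger data set or good timing from another session):

SELECT session_id, requested_memory_kb, granted_memory_kb, used_memory_kb, query_cost
FROM sys.dm_exec_query_memory_grants;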

So a small change in the query caused different query plans and cost estimations, with better performance for the smaller VARCHAR/NVARCHAR size.

My recommendation is to size VARCHAR/NVARCHAR columns without oversizing them when it is not really necessary. Of course there can be scenarios where you expect the data to grow over time, but increasing the length of a VARCHAR/NVARCHAR column is still an easier job than having to reduce it.
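For illustration, widening a VARCHAR column later is a single ALTER statement (for plain VARCHAR this is, as far as I know, a metadata-only change), while shrinking it forces SQL Server to validate every stored value:

-- widening: quick, existing rows are untouched
ALTER TABLE _varchar_1000 ALTER COLUMN id VARCHAR(2000);

-- shrinking: only succeeds if no stored value exceeds the new length
ALTER TABLE _varchar_1000 ALTER COLUMN id VARCHAR(20);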

It would be interesting to look at this topic from more perspectives. I will continue in my next posts, where I extend it to indexes, predicates, etc. Stay tuned!

 

SQL Server Traces

In the next posts I would like to focus on comparing Extended Events with Profiler traces. In this post we will look at the basic T-SQL routines for creating and handling SQL traces. You can use the SQL Server Profiler tool too: open Management Studio -> top menu -> Tools -> SQL Server Profiler.

Just a few brief words about SQL Trace. It is used to track SQL Server events triggered in the system. Such a tool can be used for monitoring SQL Server deadlocks, performance tuning, auditing the security area, etc.

A deeper comparison with Extended Events will come in one of my next posts.

Like Extended Events, SQL Server Trace works with a few basic objects:

A data column is an attribute of an event that can be collected in a trace. Not all attributes are available for all events; each event has its own set of attributes. Useful queries for trace columns and their bindings to events are below.

SELECT * FROM sys.trace_columns
SELECT * FROM sys.trace_event_bindings

An event is an object that is triggered in the system and tracked by a trace. The event contains data columns that can be collected and reported in the trace.

SELECT * FROM sys.trace_events
SELECT * FROM sys.trace_categories

A trace is a collection of events and data returned by the Database Engine. To get info about traces and their options, use the following query.

SELECT * FROM sys.traces

Trace filters are predicates limiting the events collected in a trace. To get info about the filters set on a concrete trace, use the following function with the trace ID as the parameter.

SELECT * FROM sys.fn_trace_getfilterinfo(2)

To collect trace data through T-SQL you have to do a few steps:

  1. Create the trace – to get its ID and define attributes like the destination, etc.
  2. Set the events the trace should collect
  3. Set filters, if required – to filter the event data
  4. Run/stop/remove the trace

See the detailed info below.

To create a trace, use the following code.

declare @rc int
declare @TraceID int
declare @maxfilesize bigint = 1000000          -- maximum size of one trace file in MB
declare @rollOver int = 2                      -- 2 = TRACE_FILE_ROLLOVER option
declare @path nvarchar(100) = N'C:\Trace\Test'
declare @maxfilecount int = 20

exec @rc = sp_trace_create @TraceID output, @rollOver /*enable rollover*/, @path, @maxfilesize, NULL, @maxfilecount

if (@rc != 0)
    select 'Trace could not be created, return code:', @rc
else
    select 'Trace created with ID:', @TraceID

The most important parameter is @path, which defines the destination of the trace data. It is also possible to send the output to a database table or to the SQL Server Profiler application; in the example above we use a filesystem destination. The @TraceID output parameter receives the ID the system assigned to the trace; you use this ID when referencing the trace in the other routines. You can also get the ID from the system view sys.traces.

Another parameter used in our example is @maxfilesize, which defines the size of the destination file in megabytes. When the trace data reach the defined maximum file size, the trace stops (unless rollover is enabled).

If you would like to have the trace data distributed across more files, for example for faster querying, you can enable the rollover functionality as in our example. The value set for the @rollOver parameter tells the trace to create a new file once the previous one is full. Through this parameter you can set more options; see the documentation of sp_trace_create for details.

Creating the trace with this procedure does not yet collect any data. We have to tell the trace which data should be collected, using another stored procedure.

declare @on bit = 1
declare @eventID INT = 14   -- 14 = Audit Login event
declare @TraceID INT = 2    -- ID of the trace created above

exec sp_trace_setevent @TraceID, @eventID, 10, @on   -- 10 = ApplicationName
exec sp_trace_setevent @TraceID, @eventID, 3,  @on   -- 3  = DatabaseID
exec sp_trace_setevent @TraceID, @eventID, 11, @on   -- 11 = LoginName
exec sp_trace_setevent @TraceID, @eventID, 7,  @on   -- 7  = NTDomainName
exec sp_trace_setevent @TraceID, @eventID, 8,  @on   -- 8  = HostName
exec sp_trace_setevent @TraceID, @eventID, 12, @on   -- 12 = SPID
exec sp_trace_setevent @TraceID, @eventID, 14, @on   -- 14 = StartTime
exec sp_trace_setevent @TraceID, @eventID, 35, @on   -- 35 = DatabaseName

The first parameter, @TraceID, identifies the trace. @eventID is the ID of the event, which you can find in the system view sys.trace_events. In our case event ID 14 is used, which means Audit Login data are collected.

The next parameter sets the column, i.e. the attribute you would like to collect. You can get the list of columns with their IDs from the system view sys.trace_columns or from the MSDN documentation. For example, on the first row where sp_trace_setevent is executed we pass 10, which means the application name attribute is collected.
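For example, to translate the column IDs used in the script above back to their names (a quick lookup against sys.trace_columns):

SELECT trace_column_id, name
FROM sys.trace_columns
WHERE trace_column_id IN (3, 7, 8, 10, 11, 12, 14, 35);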

The last parameter of the sp_trace_setevent procedure is a bit flag. All columns are disabled by default; by setting the bit to 1 we enable the attribute to be collected.

To filter a trace, use the stored procedure sp_trace_setfilter.

sp_trace_setfilter [ @traceid = ] trace_id   
          , [ @columnid = ] column_id  
          , [ @logical_operator = ] logical_operator  
          , [ @comparison_operator = ] comparison_operator  
          , [ @value = ]

Define @traceid to specify which trace the filter should be applied to. @columnid defines which attribute the filter is applied to. @logical_operator specifies whether AND (0) or OR (1) logic is used when combining filters. @comparison_operator (LIKE, equal, not equal, etc.) is given by its ID, which you can find in the documentation. In @value we define the value the column should be compared to.

exec sp_trace_setfilter 2, 11, 0, 6,N'sa'

The filter above tells our trace, which monitors the Audit Login event, to collect only events where the login name matches 'sa': trace = 2, LoginName = 11, 0 = AND operator, 6 = LIKE operator, N'sa' is the value to be compared.

To handle the trace activity (its state), use the sp_trace_setstatus stored procedure. See the examples below.

exec sp_trace_setstatus 2, 0 /*stop trace*/
exec sp_trace_setstatus 2, 1 /*run trace*/
exec sp_trace_setstatus 2, 2 /*remove trace - has to be stopped first*/
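Once data has been collected, the trace file can be read back with the fn_trace_gettable function; a sketch assuming the path used above (SQL Server appends the .trc extension to the file name automatically):

SELECT te.name AS event_name, t.ApplicationName, t.LoginName, t.SPID, t.StartTime, t.DatabaseName
FROM sys.fn_trace_gettable(N'C:\Trace\Test.trc', DEFAULT) AS t
JOIN sys.trace_events AS te ON t.EventClass = te.trace_event_id
ORDER BY t.StartTime;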

Next time we will look at traces in more detail and compare them with Extended Events. Stay tuned!