How is a table parameter passed to a procedure from a DataTable?

I'm interested in:

1. Performance

Will it be faster than executing individual statements?

For example, there are 1000 values in a DataTable and they need to be inserted.

In one case, I can issue 1000 INSERT statements from code; in the other, I can send the whole DataTable to the procedure.

2. Restrictions

Are there any restrictions on the size of the table argument being passed in?

For example, INSERT INTO ... VALUES can insert at most 1000 rows at a time.

3. How is the table variable filled?

Is this variable created on the server and filled with individual INSERT INTO statements from my DataTable, or does something more interesting happen, for example a single INSERT INTO ... VALUES with 1000 rows?

    1 answer

    Regarding item 1: when inserting with single INSERTs, it seems the best one can do on the .NET side to speed up insertion is to parameterize the query and prepare it (call cmd.Prepare() before entering the loop in the following example):

     using (var cmd = new SqlCommand())
     {
         cmd.Connection = connection;
         cmd.CommandText = "insert into [Table] (ID, Name) values (@id, @name)";
         var p = new SqlParameter("@id", SqlDbType.Int);
         var p2 = new SqlParameter("@name", SqlDbType.VarChar, 20);
         cmd.Parameters.Add(p);
         cmd.Parameters.Add(p2);
         cmd.Prepare(); // prepare once, before entering the loop
         for (int i = 1; i <= rowCount; i++)
         {
             p.Value = i;
             p2.Value = $"Name {i}";
             cmd.ExecuteNonQuery();
         }
     }

    In this case, thanks to parameterization, parsing and compilation of the query happen only once, on the first ExecuteNonQuery() call; and thanks to preparation, the query text itself is transmitted only once. On subsequent calls only its handle (received on the first call) is retransmitted, which can be especially beneficial if the query text is long.

    However, if there are many rows, even with parameterization and preparation there is noticeable per-call overhead: each call (one per transmitted data row) carries not only the useful data but also auxiliary data (metadata such as parameter names and types). And a result (even if it is just a success code) is returned for each row as well.

    Now consider passing the inserted data through a table-valued parameter:

     DataTable dt = new DataTable();
     dt.Columns.Add("ID", typeof(int));
     dt.Columns.Add("Name", typeof(string));
     for (int i = 1; i <= rowCount; i++)
         dt.Rows.Add(i, $"Name {i}");

     using (var cmd = new SqlCommand())
     {
         cmd.Connection = connection;
         cmd.CommandText = @"insert into [Table] (ID, Name)
                             select ID, Name from @data";
         var p = new SqlParameter("@data", SqlDbType.Structured);
         p.TypeName = "dbo.TableType"; // user-defined table type on the server
         p.Value = dt;
         cmd.Parameters.Add(p);
         cmd.ExecuteNonQuery();
     }
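    For this approach to work, a user-defined table type matching the DataTable columns must already exist on the server. A minimal sketch (the name dbo.TableType is taken from the example above; the column types and sizes are assumptions chosen to match it):

     create type dbo.TableType as table
     (
         ID int,
         Name varchar(20)
     );

    The same Structured parameter can also be passed to a stored procedure; in a procedure signature a table-valued parameter must be declared readonly. A hypothetical example (dbo.InsertData is an assumed name):

     create procedure dbo.InsertData
         @data dbo.TableType readonly
     as
         insert into [Table] (ID, Name)
         select ID, Name from @data;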

    The metadata describing the contents of the table parameter is transmitted not per row, but once for all rows of the table variable. Likewise, a single result is returned for all rows at once. The gain grows with the number of rows inserted.

    With this approach, however, some performance is lost on “shifting” the data (the data is inserted into the target table from the table variable, rather than directly as with single INSERTs).

    Therefore, yes: inserting data through a table variable will be faster than single INSERTs, but most likely only beyond a certain number of rows.

    Measurement confirms this. For the examples given,

    inserting 5 rows:

     M1: 132 msec
     M2: 19 msec
     M3: 6 msec
     M4: 34 msec

    inserting 10,000 rows:

     M1: 13614 msec
     M2: 7555 msec
     M3: 5020 msec
     M4: 115 msec

    where M1, M2, M3 and M4 are the different methods:

    • M1 - single inserts, non-parameterized query (literal values inlined)
    • M2 - single inserts, parameterized but unprepared query
    • M3 - single inserts, parameterized and prepared query
    • M4 - insert from a table variable

    Regarding item 2: a table variable can contain from 1 to 1024 columns. The number of rows is arbitrary and unlimited.

    Regarding item 3: the table parameter is transferred to the server serialized into the TDS stream, and on the server side it is deserialized.

    Profiling on the SQL Server side shows that during deserialization the corresponding table variable is created and then filled approximately as follows (pseudo-code):

     insert into @tableVariable (Column1, Column2, ...)
     select Column1, Column2, ...
     from TableValuedFunction()

    where TableValuedFunction is apparently a kind of table-valued wrapper function over the incoming data stream. That is, data gets into the table variable almost directly from the incoming request's data stream, bypassing language constructs like insert into ... values ...

    • In the case of M3, was the Prepare call included in the measured piece of code? - PashaPash ♦
    • And does the user-defined table type always have to be created on the SQL Server side beforehand, or can this somehow be avoided? - iluxa1810
    • @PashaPash, yes, Prepare was included (I measured the total execution time, i.e. each of the M variants included opening the connection, truncating the target table, and then the corresponding piece of code). - i-one
    • @iluxa1810, yes, the type has to be created; in my opinion there is no way around it. In general, if direct insertion is possible, it is better to use SqlBulkCopy (no “shifting” of the data and no need to create a type); see the sketch after these comments. - i-one
    • @i-one, you could also try measuring the speed with SET NOCOUNT ON. In theory single inserts should gain a little, since the server doesn't send anything back. - iluxa1810
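    Following up on the SqlBulkCopy suggestion in the comments, a minimal sketch (reusing dt and connection from the examples above; this is not part of the answer's measured code):

     using (var bulk = new SqlBulkCopy(connection))
     {
         // copy rows from the DataTable straight into the target table;
         // no user-defined table type is required
         bulk.DestinationTableName = "[Table]";
         bulk.ColumnMappings.Add("ID", "ID");
         bulk.ColumnMappings.Add("Name", "Name");
         bulk.WriteToServer(dt);
     }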
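    And a sketch of the NOCOUNT idea from the last comment: prepending SET NOCOUNT ON to the command text suppresses the per-statement "rows affected" messages (an assumed modification of the first single-insert example; note that ExecuteNonQuery() then returns -1):

     cmd.CommandText = "set nocount on; insert into [Table] (ID, Name) values (@id, @name)";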