PERF: Add C++ DetectParamTypes + SQLExecuteFast pipeline #549

Draft
bewithgaurav wants to merge 4 commits into main from bewithgaurav/insertmany-perf-detect-types
@bewithgaurav
Collaborator

Work Item / Issue Reference

AB#<WORK_ITEM_ID>

GitHub Issue: #<ISSUE_NUMBER>


Summary

Move parameter type detection from Python into C++ using raw CPython
type checks (PyLong_CheckExact, PyFloat_CheckExact, etc.). Merge the
DetectParamTypes → BindParameters → SQLExecute pipeline into a single
DDBCSQLExecuteFast call so ParamInfo never crosses the pybind11 boundary.

- DetectParamTypes: handles int (range-detected), float, bool, str
  (unicode + geometry sniffing), bytes, datetime/date/time, Decimal
  (MONEY range + generic numeric), UUID, None, with fallback to string
- SQLExecuteFast_wrap: single pipeline with GIL release, always uses
  SQLPrepare for parameterized queries
- cursor.py: fast path routing when no setinputsizes overrides present;
  old DDBCSQLExecute path preserved for setinputsizes callers
- Named constants: MAX_INLINE_CHAR, MAX_INLINE_BINARY, MAX_NUMERIC_PRECISION,
  MONEY/SMALLMONEY ranges, PARAM_C_TYPE_TEXT platform macro
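
The "int (range-detected)" item above can be illustrated with a minimal sketch. It assumes the detector picks the narrowest SQL Server integer type that holds the value. The `IntBinding` struct and `classify_int` function are hypothetical names for illustration, and only the BIGINT column size of 19 is confirmed by the excerpts in this PR; the other buckets use the documented SQL Server ranges and are an assumption about what the PR does.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical result type; the PR's real ParamInfo stores ODBC type codes.
struct IntBinding {
    std::string sqlType;  // type name only, for illustration
    int columnSize;       // maximum number of decimal digits
};

// Pick the narrowest SQL Server integer type that can hold `value`.
// Assumption: the PR may use different buckets than the ones shown here.
inline IntBinding classify_int(std::int64_t value) {
    if (value >= 0 && value <= 255) return {"TINYINT", 3};  // TINYINT is unsigned
    if (value >= INT16_MIN && value <= INT16_MAX) return {"SMALLINT", 5};
    if (value >= INT32_MIN && value <= INT32_MAX) return {"INTEGER", 10};
    return {"BIGINT", 19};
}
```

Note that a negative value such as -1 skips TINYINT in this sketch, because SQL Server's TINYINT covers only 0 through 255.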
- Add complete DAE (Data-At-Execution) loop to SQLExecuteFast_wrap:
  SQL_NEED_DATA → SQLParamData/SQLPutData for large str/bytes/binary,
  matching the existing SQLExecute_wrap logic exactly
- Fix DAE type assignment: non-unicode DAE strings use SQL_C_CHAR
  (not PARAM_C_TYPE_TEXT which maps to SQL_C_WCHAR on macOS/Linux)
- Fix MONEY range lower bound: use MONEY_MIN not SMALLMONEY_MIN so
  negative decimals in MONEY range bind as VARCHAR (matches Python path)
- Raise TypeError for unknown param types instead of silent str conversion
- Add SQLFreeStmt(SQL_RESET_PARAMS) to unbind after execute
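
The MONEY lower-bound fix described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: it uses `double` for brevity (so the extreme bounds are rounded), the helper names are hypothetical, and the numeric limits are the documented SQL Server ranges for MONEY and SMALLMONEY.

```cpp
#include <cassert>

// Documented SQL Server limits; the MONEY bounds are rounded by double precision.
constexpr double SMALLMONEY_MIN = -214748.3648;
constexpr double SMALLMONEY_MAX = 214748.3647;
constexpr double MONEY_MIN = -922337203685477.5808;
constexpr double MONEY_MAX = 922337203685477.5807;

// Hypothetical helper: true when the value fits in SMALLMONEY.
inline bool in_smallmoney_range(double v) {
    return v >= SMALLMONEY_MIN && v <= SMALLMONEY_MAX;
}

// Hypothetical helper: the corrected check, using MONEY_MIN as the lower bound.
inline bool in_money_range(double v) {
    return v >= MONEY_MIN && v <= MONEY_MAX;
}

// The kind of mixed-bound check the fix describes: SMALLMONEY_MIN paired
// with MONEY_MAX wrongly rejects large negative values.
inline bool in_money_range_mixed_bounds(double v) {
    return v >= SMALLMONEY_MIN && v <= MONEY_MAX;
}
```

With the mixed bounds, a value like -1000000 falls outside the check even though it is a valid MONEY value, which is the asymmetry the fix removes.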

github-actions Bot commented Apr 29, 2026

📊 Code Coverage Report

🔥 Diff Coverage

92%


🎯 Overall Coverage

79%


📈 Total Lines Covered: 7161 out of 9032
📁 Project: mssql-python


Diff Coverage

Diff: main...HEAD, staged and unstaged changes

  • mssql_python/cursor.py (100%)
  • mssql_python/pybind/ddbc_bindings.cpp (92.1%): Missing lines 552-556,575,697-704,2495-2496,2502-2503,2505-2506,2552-2553,2556-2558,2586-2587,2598-2599,2614-2615

Summary

  • Total: 405 lines
  • Missing: 31 lines
  • Coverage: 92%

mssql_python/pybind/ddbc_bindings.cpp

Lines 548-560

  548                     info.paramCType = SQL_C_SBIGINT;
  549                     info.columnSize = 19;
  550                 }
  551             } else {
! 552                 PyErr_Clear();
! 553                 info.paramSQLType = SQL_BIGINT;
! 554                 info.paramCType = SQL_C_SBIGINT;
! 555                 info.columnSize = 19;
! 556             }
  557             info.decimalDigits = 0;
  558             continue;
  559         }

Lines 571-579

  571         if (PyUnicode_CheckExact(obj)) {
  572             Py_ssize_t length = PyUnicode_GET_LENGTH(obj);
  573             unsigned int kind = PyUnicode_KIND(obj);
  574 
! 575             Py_ssize_t utf16_len;
  576             if (kind <= PyUnicode_2BYTE_KIND) {
  577                 utf16_len = length;
  578             } else {
  579                 utf16_len = 0;
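
The excerpt above is cut off where the 4-byte branch begins. A minimal sketch of the underlying idea, assuming the branch counts UTF-16 code units per code point: characters above U+FFFF need a surrogate pair, so they count as two units. The `utf16_length` function is a hypothetical stand-in that works on a `std::u32string` instead of a CPython string object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count the UTF-16 code units needed to encode a sequence of code points.
inline std::size_t utf16_length(const std::u32string& code_points) {
    std::size_t units = 0;
    for (char32_t cp : code_points) {
        // Code points beyond the Basic Multilingual Plane need a surrogate pair.
        units += (cp > 0xFFFF) ? 2 : 1;
    }
    return units;
}
```

This explains why the 1-byte and 2-byte kinds can reuse the character count directly: neither can contain a code point above U+FFFF.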

Lines 693-708

  693             py::object as_tuple = h.attr("as_tuple")();
  694             py::object exponent_obj = as_tuple.attr("exponent");
  695 
  696             if (py::isinstance<py::str>(exponent_obj)) {
! 697                 info.paramSQLType = SQL_NUMERIC;
! 698                 info.paramCType = SQL_C_NUMERIC;
! 699                 info.columnSize = MAX_NUMERIC_PRECISION;
! 700                 info.decimalDigits = 0;
! 701                 py::object numeric_data = build_numeric_data(py::reinterpret_borrow<py::object>(h));
! 702                 PyList_SET_ITEM(params.ptr(), i, numeric_data.release().ptr());
! 703                 continue;
! 704             }
  705 
  706             py::tuple digits = as_tuple.attr("digits").cast<py::tuple>();
  707             int num_digits = static_cast<int>(py::len(digits));
  708             int exponent = exponent_obj.cast<int>();
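
The excerpt stops right after reading the digit count and the exponent. One conventional way to turn those two values into a NUMERIC precision and scale is sketched below. The code after line 708 is not shown in this report, so the mapping, the `NumericShape` struct, and the `numeric_shape` function are assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical holder for the derived NUMERIC attributes.
struct NumericShape {
    int precision;  // total number of digits
    int scale;      // digits to the right of the decimal point
};

// Derive precision and scale from Decimal.as_tuple() style inputs.
inline NumericShape numeric_shape(int num_digits, int exponent) {
    if (exponent >= 0) {
        // e.g. Decimal("1E+3"): digits (1,), exponent 3, so 4 digits and no fraction
        return {num_digits + exponent, 0};
    }
    int scale = -exponent;
    if (scale <= num_digits) {
        // e.g. Decimal("123.45"): digits (1,2,3,4,5), exponent -2
        return {num_digits, scale};
    }
    // e.g. Decimal("0.001"): digits (1,), exponent -3, leading zeros count
    return {scale, scale};
}
```

The string-exponent branch in the excerpt (lines 697-704) covers special values such as NaN and Infinity, whose exponent is a letter and cannot go through this arithmetic.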

Lines 2491-2500

  2491                               py::list is_stmt_prepared,
  2492                               bool /*use_prepare*/,
  2493                               const py::dict& encoding_settings) {
  2494     if (!statementHandle || !statementHandle->get()) {
! 2495         return SQL_INVALID_HANDLE;
! 2496     }
  2497 
  2498     SQLHANDLE hStmt = statementHandle->get();
  2499     std::string charEncoding = "utf-8";
  2500     std::string wcharEncoding = "utf-16le";

Lines 2498-2510

  2498     SQLHANDLE hStmt = statementHandle->get();
  2499     std::string charEncoding = "utf-8";
  2500     std::string wcharEncoding = "utf-16le";
  2501     if (encoding_settings.contains("charEncoding")) {
! 2502         charEncoding = encoding_settings["charEncoding"].cast<std::string>();
! 2503     }
  2504     if (encoding_settings.contains("wcharEncoding")) {
! 2505         wcharEncoding = encoding_settings["wcharEncoding"].cast<std::string>();
! 2506     }
  2507 
  2508     RETCODE rc;
  2509     bool already_prepared = is_stmt_prepared[0].cast<bool>();

Lines 2548-2562

  2548                     break;
  2549                 }
  2550             }
  2551             if (!matchedInfo) {
! 2552                 ThrowStdException("SQLExecuteFast: unrecognized paramToken from SQLParamData");
! 2553             }
  2554             const py::object& pyObj = matchedInfo->dataPtr;
  2555             if (pyObj.is_none()) {
! 2556                 SQLPutData_ptr(hStmt, nullptr, 0);
! 2557                 continue;
! 2558             }
  2559 
  2560             if (py::isinstance<py::str>(pyObj)) {
  2561                 if (matchedInfo->paramCType == SQL_C_WCHAR) {
  2562                     std::wstring wstr = pyObj.cast<std::wstring>();

Lines 2582-2591

  2582                     try {
  2583                         py::object encoded = pyObj.attr("encode")(charEncoding, "strict");
  2584                         encodedStr = encoded.cast<std::string>();
  2585                     } catch (const py::error_already_set&) {
! 2586                         throw;
! 2587                     }
  2588                     const char* dataPtr = encodedStr.data();
  2589                     size_t totalBytes = encodedStr.size();
  2590                     for (size_t offset = 0; offset < totalBytes; offset += DAE_CHUNK_SIZE) {
  2591                         size_t len = std::min(static_cast<size_t>(DAE_CHUNK_SIZE),

Lines 2594-2603

  2594                                             static_cast<SQLLEN>(len));
  2595                         if (!SQL_SUCCEEDED(rc)) return rc;
  2596                     }
  2597                 } else {
! 2598                     ThrowStdException("SQLExecuteFast: unsupported C type for str in DAE");
! 2599                 }
  2600             } else if (py::isinstance<py::bytes>(pyObj) ||
  2601                        py::isinstance<py::bytearray>(pyObj)) {
  2602                 py::bytes b = pyObj.cast<py::bytes>();
  2603                 std::string s = b;

Lines 2610-2619

  2610                                         static_cast<SQLLEN>(len));
  2611                     if (!SQL_SUCCEEDED(rc)) return rc;
  2612                 }
  2613             } else {
! 2614                 ThrowStdException("SQLExecuteFast: DAE only supported for str or bytes");
! 2615             }
  2616         }
  2617         if (!SQL_SUCCEEDED(rc) && rc != SQL_NO_DATA) return rc;
  2618     }
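
The chunked loops in the excerpts above follow one pattern: walk the buffer in steps of `DAE_CHUNK_SIZE`, send each slice, and stop at the first failure. A minimal, driver-free sketch of that pattern follows. The `PutChunk` callback is a hypothetical stand-in for `SQLPutData`, and `send_in_chunks` is an illustrative name, not a function from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for SQLPutData: returns true on success.
using PutChunk = std::function<bool(const char* data, std::size_t len)>;

// Send `payload` in slices of at most `chunk_size` bytes.
// Returns false on the first failed chunk, mirroring the early return on a bad rc.
inline bool send_in_chunks(const std::string& payload, std::size_t chunk_size,
                           const PutChunk& put) {
    if (chunk_size == 0) return false;  // a zero step would never advance
    for (std::size_t offset = 0; offset < payload.size(); offset += chunk_size) {
        std::size_t len = std::min(chunk_size, payload.size() - offset);
        if (!put(payload.data() + offset, len)) return false;
    }
    return true;  // an empty payload sends no chunks
}
```

A 10-byte payload with a chunk size of 4 produces three calls with lengths 4, 4, and 2.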


📋 Files Needing Attention

📉 Files with overall lowest coverage
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.pybind.ddbc_bindings.h: 67.9%
mssql_python.row.py: 70.5%
mssql_python.pybind.logger_bridge.hpp: 70.8%
mssql_python.pybind.ddbc_bindings.cpp: 74.9%
mssql_python.pybind.connection.connection.cpp: 76.2%
mssql_python.__init__.py: 77.3%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.connection.py: 85.3%

- Comment out use_prepare parameter name (C4100: unreferenced parameter)
- Remove unused catch variable name (C4101: unreferenced local variable)
Add explicit null pointer and zero-length guards before memcpy in
build_numeric_data to satisfy DevSkim code scanning rule DS121708.
    std::memset(&nd.val[0], 0, SQL_MAX_NUMERIC_LEN);
    size_t copy_len = std::min(val_str.size(), static_cast<size_t>(SQL_MAX_NUMERIC_LEN));
    if (copy_len > 0 && val_str.data() != nullptr) {
        std::memcpy(&nd.val[0], val_str.data(), copy_len);
    }
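
For readers who want to try the guard in isolation, here is a self-contained sketch of the same zero-fill, clamp, and guarded copy. It assumes nothing about the rest of `build_numeric_data`: `copy_numeric_bytes` and `kMaxNumericLen` are hypothetical names, and 16 is the value of `SQL_MAX_NUMERIC_LEN` in the ODBC headers.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Same value as SQL_MAX_NUMERIC_LEN in the ODBC headers.
constexpr std::size_t kMaxNumericLen = 16;

// Copy at most kMaxNumericLen bytes into a zero-filled buffer.
inline std::array<unsigned char, kMaxNumericLen> copy_numeric_bytes(const std::string& val_str) {
    std::array<unsigned char, kMaxNumericLen> val{};  // value-initialized to zeros
    std::size_t copy_len = std::min(val_str.size(), kMaxNumericLen);
    // Skip memcpy for empty input; the pointer check matches the scanner-driven guard.
    if (copy_len > 0 && val_str.data() != nullptr) {
        std::memcpy(val.data(), val_str.data(), copy_len);
    }
    return val;
}
```

Input longer than 16 bytes is truncated by the clamp, and shorter input leaves the tail zeroed.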