Skip to content

Types: Add support for BINARY columns#12

Draft
amotl wants to merge 4 commits into
mainfrom
amo/more-types
Draft

Types: Add support for BINARY columns#12
amotl wants to merge 4 commits into
mainfrom
amo/more-types

Conversation

@amotl

@amotl amotl commented Dec 23, 2023

Copy link
Copy Markdown
Contributor

About

Improvements to be mainlined from https://github.com/pyveci/supertask/blob/19316e2/supertask/store/cratedb.py. The support for binary data types is being emulated by serializing them to STRING using base64. In that way, pickled data can be stored and retrieved without further ado.

@amotl amotl changed the title Types: Add support for BINARY columns and improve support for FLOATs [TYPES] Add support for BINARY columns and improve support for FLOATs Dec 23, 2023

@matriv matriv left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that CrateDB also supports BitString type, but I don't think that's what we want here, right?

@amotl

amotl commented Dec 24, 2023

Copy link
Copy Markdown
Contributor Author

Keep in mind that CrateDB also supports BitString type, but I don't think that's what we want here, right?

Right. Here, it is about providing an emulation for a BLOB-type column. However, expanding type support to include the BIT type will be nice to have for the sake of completeness. So far, I didn't need it, but I am sure others will.

Comment on lines +17 to +30
def bind_processor(self, dialect):
if dialect.dbapi is None:
return None

# TODO: DBAPIBinary = dialect.dbapi.Binary

def process(value):
if value is not None:
# TODO: return DBAPIBinary(value)
return base64.b64encode(value).decode()
else:
return None

return process

@amotl amotl Dec 24, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, or others who want to pick this up:

Review that detail about returning a DBAPIBinary, or not.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Did you review it?

@amotl amotl Jun 4, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I just staged the patch directly from its origin, where the code is in use and works well. However, it can still be wrong from a generic perspective.

Let's toggle this patch back into draft mode: Better safe than sorry, thanks! The other commit 465b55b has been removed here and diverged to a separate patch.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bgunebakan,

Review that detail about returning a DBAPIBinary, or not.

No matter how the downstream supertask package currently consumes this API, I think we should aim for making it standards-compliant and then adjust supertask correspondingly, when possible. In this spirit, I think this comment is the most prominent example what we are referring to:

# TODO: return DBAPIBinary(value) 

Only because we "unlocked" this feature by harmonizing sqlalchemy-cratedb with supertask in a bare-bones way to satisfy some basic explorations the other day doesn't automatically mean that the currently implemented interface is correct yet.

With kind regards,
Andreas.

Comment on lines +32 to +44
# Python 3 has native bytes() type
# both sqlite3 and pg8000 seem to return it,
# psycopg2 as of 2.5 returns 'memoryview'
def result_processor(self, dialect, coltype):
if dialect.returns_native_bytes:
return None

def process(value):
if value is not None:
return base64.b64decode(value)
return value

return process

@amotl amotl Dec 24, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self, or others who want to pick this up:

Review that detail about dialect.returns_native_bytes: Should it be handled differently, because, well, base64.b64decode actually returns native bytes already?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Did you review it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment. I've responded more extensively above. Yes, I agree this patch should get more love before considering it ready, optimally in harmony with its consumer site in supertask.

@amotl amotl changed the base branch from amo/postgresql-async to amo/fix-inspector January 15, 2024 21:34
@amotl amotl force-pushed the amo/more-types branch 2 times, most recently from 9941c5d to 2a34862 Compare January 15, 2024 21:36
@amotl amotl force-pushed the amo/fix-inspector branch from 2f08b2a to 43c45fb Compare January 15, 2024 21:50
@amotl amotl changed the title [TYPES] Add support for BINARY columns and improve support for FLOATs Types: Add support for BINARY columns and improve support for FLOATs Jan 15, 2024
@amotl amotl force-pushed the amo/fix-inspector branch 5 times, most recently from 6ac0a22 to d2c7613 Compare June 13, 2024 14:33
Base automatically changed from amo/fix-inspector to main June 13, 2024 14:38
@amotl amotl force-pushed the amo/more-types branch 3 times, most recently from 8f01308 to 73bfe8e Compare June 25, 2024 14:51
@amotl amotl force-pushed the amo/more-types branch 3 times, most recently from 2707929 to de288ab Compare March 30, 2025 02:20
@amotl amotl force-pushed the amo/more-types branch 2 times, most recently from 09c55fc to 9d1fc10 Compare March 30, 2025 03:54
Comment thread tests/test_support_pandas.py Fixed
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5c41c1ac-be88-4447-a012-55aa02a3e31e

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch amo/more-types

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@amotl amotl requested review from bgunebakan and removed request for hlcianfagna and surister June 2, 2026 20:42
@amotl amotl marked this pull request as ready for review June 2, 2026 20:42
@amotl amotl requested a review from kneth June 3, 2026 10:02

@kneth kneth left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need tests for BLOBs.

if dialect.dbapi is None:
return None

# TODO: DBAPIBinary = dialect.dbapi.Binary

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be removed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... or activated, to be more standards-compliant. Let's investigate within another iteration before getting things wrong. Thanks!

class LargeBinary(sa.String):
"""A type for large binary byte data.

The :class:`.LargeBinary` type corresponds to a large and/or unlengthed

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that we also mention CrateDB' BLOB type here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CrateDB does not provide a BLOB type, does it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BLOB might require some tweaks to add it to our SQL Alchemy variant. SinceBLOB is only working over HTTP(s), it could be natural to add support for it.

@amotl amotl Jun 8, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. The CrateDB BLOB support you are mentioning is not made through a data type, but a table variant instead. In this spirit, it is a totally different animal.

Comment on lines +17 to +30
def bind_processor(self, dialect):
if dialect.dbapi is None:
return None

# TODO: DBAPIBinary = dialect.dbapi.Binary

def process(value):
if value is not None:
# TODO: return DBAPIBinary(value)
return base64.b64encode(value).decode()
else:
return None

return process

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Did you review it?

Comment on lines +32 to +44
# Python 3 has native bytes() type
# both sqlite3 and pg8000 seem to return it,
# psycopg2 as of 2.5 returns 'memoryview'
def result_processor(self, dialect, coltype):
if dialect.returns_native_bytes:
return None

def process(value):
if value is not None:
return base64.b64decode(value)
return value

return process

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Did you review it?

@amotl amotl marked this pull request as draft June 4, 2026 16:30
@amotl amotl changed the title Types: Add support for BINARY columns and improve support for FLOATs Types: Add support for BINARY columns Jun 4, 2026
@amotl amotl force-pushed the amo/more-types branch from 465b55b to 4fd1f77 Compare June 4, 2026 16:35
@amotl amotl force-pushed the amo/more-types branch from 4fd1f77 to b504343 Compare June 4, 2026 21:09
@bgunebakan bgunebakan self-assigned this Jun 12, 2026
@bgunebakan bgunebakan added the enhancement New feature or request label Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants