Categorical Features Support in T/R learners #915

Open
aman-coder03 wants to merge 3 commits into uber:master from aman-coder03:fix/categorical-features-meta-learners

@aman-coder03

Contributor

Proposed changes

Fixes #825
When a pd.DataFrame containing dtype="category" columns is passed to any meta-learner, convert_pd_to_np() calls .to_numpy() on it, which discards the per-column dtype information. By the time the underlying learner (e.g. XGBoost with enable_categorical=True) receives the data, the categorical dtype is gone and the learner errors out.
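The dtype loss described above can be reproduced with pandas alone. This is a minimal sketch; the column names and values are made up for illustration and are not taken from the PR or from issue #825.

```python
import numpy as np
import pandas as pd

# Hypothetical feature frame with one numeric and one categorical column.
X = pd.DataFrame({
    "age": [25.0, 32.0, 47.0],
    "segment": pd.Categorical([0, 1, 0]),
})

# The DataFrame tracks a dtype per column, including the categorical one.
print(X.dtypes)

# .to_numpy() returns a single ndarray with one dtype for the whole array,
# so there is nowhere left to record that "segment" was categorical.
X_np = X.to_numpy()
print(type(X_np), X_np.dtype)
```

A learner that relies on the pandas categorical dtype to detect categorical features has nothing to inspect once it receives `X_np`.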

Two changes fix this:

  1. convert_pd_to_np() in utils.py now skips the NumPy conversion for DataFrames that contain at least one categorical column and passes them through as-is.
  2. BaseTLearner in tlearner.py no longer converts X to NumPy in fit, predict, fit_predict, and estimate_ate. Only treatment and y are converted, since those are used for masking and arithmetic. X is left alone so its dtypes survive all the way to the learner.
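The two changes above can be sketched roughly as follows. This is an illustration of the idea, not the code in this PR: `has_categorical` and `convert_pd_to_np_sketch` are hypothetical names, and the real `convert_pd_to_np()` in utils.py may be structured differently.

```python
import numpy as np
import pandas as pd


def has_categorical(obj):
    """Return True if obj is a DataFrame with at least one category column."""
    return isinstance(obj, pd.DataFrame) and any(
        isinstance(dtype, pd.CategoricalDtype) for dtype in obj.dtypes
    )


def convert_pd_to_np_sketch(*args):
    """Convert pandas objects to NumPy, except DataFrames with categoricals."""
    output = []
    for obj in args:
        if has_categorical(obj):
            output.append(obj)             # pass through so dtypes survive
        elif hasattr(obj, "to_numpy"):
            output.append(obj.to_numpy())  # previous behavior
        else:
            output.append(obj)             # already a non-pandas object
    return output if len(output) > 1 else output[0]


# Hypothetical data: X keeps its categorical column, treatment and y do not.
X = pd.DataFrame({
    "age": [25.0, 32.0, 47.0, 51.0],
    "segment": pd.Categorical([0, 1, 0, 1]),
})
treatment = pd.Series([0, 1, 0, 1])
y = pd.Series([1.0, 2.5, 0.5, 3.0])

X_out, treatment_np, y_np = convert_pd_to_np_sketch(X, treatment, y)

# Masking X with the NumPy treatment array still works on a DataFrame,
# and the selected rows keep the categorical dtype.
X_control = X_out[treatment_np == 0]
print(X_control.dtypes)
```

Because treatment and y become plain arrays, the masking and arithmetic mentioned in the second change are unaffected, while X reaches the learner with its column dtypes intact.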

Existing behavior is otherwise preserved: DataFrames without categorical columns are still converted to NumPy as before.

NOTE: XGBoost 3.x only supports integer-coded categoricals (pd.Categorical([0, 1, 2, ...])), not string-valued ones (pd.Categorical(['a', 'b', 'c', ...])). This is an XGBoost limitation, not a CausalML one.
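If the input has string-valued categories, one possible workaround is to recode them to integer codes with pandas before fitting. The sketch below uses only pandas; the column name `segment` is hypothetical, and whether this recoding suits a given model is left to the user.

```python
import pandas as pd

# Hypothetical string-valued categorical column.
df = pd.DataFrame({"segment": pd.Categorical(["a", "b", "a", "c"])})

# .cat.codes gives the integer code of each value; casting the codes back
# to "category" yields an integer-coded categorical column.
df["segment_coded"] = df["segment"].cat.codes.astype("category")

print(df.dtypes)
print(list(df["segment_coded"]))
```

The mapping from codes back to labels remains available through `df["segment"].cat.categories`.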

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc. This PR template is adopted from appium.

@jeongyoonlee added the enhancement (New feature or request) label on Jun 20, 2026

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Allowing Categorical Features in T/R Learners

2 participants