In lockstep with the new Rust release, we released pyo3-polars 0.8.

This comes with a new feature that lets you compile a Rust function and expose it as an expression that is dynamically linked into the Polars version released on PyPI.

This means the function is called by Polars’ engine and benefits from parallelism, Rust performance, and optimizations, without any GIL locking.

To expose a function as an expression, you tag it with the polars_expr proc macro.

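use std::fmt::Write;

use polars::prelude::*;
use pyo3_polars::derive::polars_expr;
use serde::Deserialize;

fn pig_latin_str(value: &str, capitalize: bool, output: &mut String) {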
    if let Some(first_char) = value.chars().next() {
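        if capitalize {
            // Upper-case everything after the first character...
            for c in value.chars().skip(1).map(|char| char.to_uppercase()) {
                write!(output, "{c}").unwrap()
            }
            // ...then append the upper-cased first character and the suffix.
            write!(output, "{}AY", first_char.to_uppercase()).unwrap()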
        } else {
            let offset = first_char.len_utf8();
            write!(output, "{}{}ay", &value[offset..], first_char).unwrap()
        }
    }
}

#[derive(Deserialize)]
struct PigLatinKwargs {
    capitalize: bool,
}

#[polars_expr(output_type=Utf8)]
fn pig_latinnify(inputs: &[Series], kwargs: PigLatinKwargs) -> PolarsResult<Series> {
    let ca = inputs[0].utf8()?;
    let out: Utf8Chunked =
        ca.apply_to_buffer(|value, output| pig_latin_str(value, kwargs.capitalize, output));
    Ok(out.into_series())
}

On the Python side, this can then be registered as follows:

import polars as pl
from polars.utils.udfs import _get_shared_lib_location

lib = _get_shared_lib_location(__file__)


@pl.api.register_expr_namespace("language")
class Language:
    def __init__(self, expr: pl.Expr):
        self._expr = expr

    def pig_latinnify(self, capitalize: bool = False) -> pl.Expr:
        return self._expr._register_plugin(
            lib=lib,
            symbol="pig_latinnify",
            is_elementwise=True,
            kwargs={"capitalize": capitalize},
        )

Then compile the plugin with maturin, install Polars with pip install polars, and you are good to go.

import polars as pl
import expression_lib

df = pl.DataFrame({
    "names": ["Richard", "Alice", "Bob"],
})


out = df.with_columns(
    pig_latin=pl.col("names").language.pig_latinnify()
)
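
To make the expected result concrete, here is a minimal pure-Python sketch that mirrors only the non-capitalizing branch of pig_latin_str above. The name pig_latin_reference is hypothetical and not part of pyo3-polars or Polars; it is just a reference for checking the plugin's output by hand.

```python
def pig_latin_reference(value: str) -> str:
    """Mirror the non-capitalizing branch of the Rust helper (hypothetical name)."""
    if not value:
        # The Rust helper writes nothing to the buffer for an empty string.
        return ""
    # Move the first character to the end and append "ay".
    return f"{value[1:]}{value[0]}ay"


names = ["Richard", "Alice", "Bob"]
expected = [pig_latin_reference(name) for name in names]
print(expected)  # ['ichardRay', 'liceAay', 'obBay']
```

With the plugin built and installed, out["pig_latin"].to_list() should match this list, assuming the default capitalize=False.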

See the full examples here: https://github.com/pola-rs/pyo3-polars/tree/main/example/derive_expression

Or take a look at the plugins page in the user guide: https://pola-rs.github.io/polars/user-guide/expressions/plugins/

  • marsupiq@fediverser.communick.devB

    I started learning Rust a year ago, but then stopped, thinking: this looks amazing, but I will never have any use case for it…

    I guess this just changed. :)