Support quantize_ for convolution on XNNPACK #17432

@GregoryComer

Description

XNNPACK's partitioner and lowering logic don't recognize the quantize_affine/dequantize_affine representation that torchao's quantize_ produces in the IR for convolution nodes. We need to wire this up.

Example quantization code:

import torch
from torchao.quantization import quantize_, IntxWeightOnlyConfig
from torchao.quantization.granularity import PerAxis

conv_config = IntxWeightOnlyConfig(
    weight_dtype=torch.int8,
    granularity=PerAxis(0),
)
quantize_(
    model,
    conv_config,
    lambda m, fqn: isinstance(m, torch.nn.Conv2d),
)
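For reference, the numeric transform that this config applies to each Conv2d weight is per-axis (per output channel) symmetric int8 quantization. The sketch below is not ExecuTorch or torchao code; it is a minimal NumPy illustration of that math, with the helper names (quantize_per_axis_int8, dequantize) invented here for clarity:

```python
# Hedged sketch: per-axis symmetric int8 weight quantization, the
# transform IntxWeightOnlyConfig(weight_dtype=torch.int8,
# granularity=PerAxis(0)) performs conceptually. NumPy-only illustration;
# helper names are hypothetical, not torchao APIs.
import numpy as np

def quantize_per_axis_int8(weight, axis=0):
    """One scale per slice along `axis` (axis 0 = output channel, OIHW)."""
    reduce_axes = tuple(i for i in range(weight.ndim) if i != axis)
    amax = np.abs(weight).max(axis=reduce_axes, keepdims=True)
    scale = amax / 127.0  # symmetric int8 range [-127, 127]
    q = np.clip(np.round(weight / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Affine dequantization with zero_point = 0 (symmetric case).
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3)).astype(np.float32)  # OIHW conv weight
q, scale = quantize_per_axis_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
```

In the exported graph, this shows up as dequantize_affine nodes feeding the convolution, which is the pattern the partitioner would need to match.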

cc @digantdesai @cbilgin

Metadata

    Labels

module: xnnpack (issues related to xnnpack delegation and the code under backends/xnnpack/)
