SIGSEGV when calling OrtSession.run() #24288

Open
wy8023123 opened this issue Apr 3, 2025 · 1 comment
Labels
api:Java issues related to the Java API

Comments

@wy8023123

Describe the issue

I am using Silero-VAD for voice activity detection (VAD). Most of the time I can create an OrtSession and retrieve an OrtSession.Result without problems.
Occasionally, however, calling OrtSession.run() from Java on Linux crashes the JVM with a SIGSEGV.

This is my hs_err_pid log:

hs_err_pid13113.log

To reproduce

Here is my Java code. It uses silero-vad tag v4.0 (https://github.com/snakers4/silero-vad/tree/v4.0).

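`import ai.onnxruntime.OrtException;

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Note: the log field used in the constructor is not declared in this snippet.
public class SileroVadDetector {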
// OnnxModel model used for speech processing
private final SileroVadOnnxModel model;
// Threshold for speech start
private final float startThreshold;
// Threshold for speech end
private final float endThreshold;
// Sampling rate
private final int samplingRate;
// Minimum number of silence samples to determine the end threshold of speech
private final float minSilenceSamples;
// Additional number of samples for speech start or end to calculate speech start or end time
private final float speechPadSamples;
// Whether in the triggered state (i.e. whether speech is being detected)
private boolean triggered;
// Temporarily stored number of speech end samples
private int tempEnd;
// Number of samples currently being processed
private int currentSample;

public SileroVadDetector(String modelPath,
                         float startThreshold,
                         float endThreshold,
                         int samplingRate,
                         int minSilenceDurationMs,
                         int speechPadMs) throws OrtException {
    // Check if the sampling rate is 8000 or 16000, if not, throw an exception
    if (samplingRate != 8000 && samplingRate != 16000) {
        throw new IllegalArgumentException("does not support sampling rates other than [8000, 16000]");
    }

    // Initialize the parameters
    this.model = new SileroVadOnnxModel(modelPath);
    this.startThreshold = startThreshold;
    this.endThreshold = endThreshold;
    this.samplingRate = samplingRate;
    this.minSilenceSamples = samplingRate * minSilenceDurationMs / 1000f;
    this.speechPadSamples = samplingRate * speechPadMs / 1000f;
    // Reset the state
    reset();
    log.info("silero-vad detector has initialized! startThreshold:{} endThreshold:{} samplingRate:{} minSilenceSamples:{} speechPadSamples:{}", startThreshold, endThreshold, samplingRate, minSilenceSamples, speechPadSamples);
}

// Method to reset the state, including the model state, trigger state, temporary end time, and current sample count
public void reset() {
    model.resetStates();
    triggered = false;
    tempEnd = 0;
    currentSample = 0;
}

// apply method for processing the audio array, returning possible speech start or end times
public Map<String, Double> apply(byte[] data, boolean returnSeconds) {

    // Convert the byte array to a float array
    float[] audioData = new float[data.length / 2];
    for (int i = 0; i < audioData.length; i++) {
        audioData[i] = ((data[i * 2] & 0xff) | (data[i * 2 + 1] << 8)) / 32767.0f;
    }

    // Get the length of the audio array as the window size
    int windowSizeSamples = audioData.length;
    // Update the current sample count
    currentSample += windowSizeSamples;

    // Call the model to get the prediction probability of speech
    float speechProb = 0;
    try {
        speechProb = model.call(new float[][]{audioData}, samplingRate)[0];
    } catch (OrtException e) {
        throw new RuntimeException(e);
    }

    // If the speech probability is greater than the threshold and the temporary end time is not 0, reset the temporary end time
    // This indicates that the speech duration has exceeded expectations and needs to recalculate the end time
    if (speechProb >= startThreshold && tempEnd != 0) {
        tempEnd = 0;
    }

    // If the speech probability is greater than the threshold and not in the triggered state, set to triggered state and calculate the speech start time
    if (speechProb >= startThreshold && !triggered) {
        triggered = true;
        int speechStart = (int) (currentSample - speechPadSamples);
        speechStart = Math.max(speechStart, 0);
        Map<String, Double> result = new HashMap<>();
        // Decide whether to return the result in seconds or sample count based on the returnSeconds parameter
        if (returnSeconds) {
            double speechStartSeconds = speechStart / (double) samplingRate;
            double roundedSpeechStart = BigDecimal.valueOf(speechStartSeconds).setScale(1, RoundingMode.HALF_UP).doubleValue();
            result.put("start", roundedSpeechStart);
        } else {
            result.put("start", (double) speechStart);
        }

        return result;
    }

    // If the speech probability is less than a certain threshold and in the triggered state, calculate the speech end time
    if (speechProb < endThreshold && triggered) {
        // Initialize or update the temporary end time
        if (tempEnd == 0) {
            tempEnd = currentSample;
        }
        // If the number of silence samples between the current sample and the temporary end time is less than the minimum silence samples, return an empty map
        // This indicates that it is not yet possible to determine whether the speech has ended
        if (currentSample - tempEnd < minSilenceSamples) {
            return Collections.emptyMap();
        } else {
            // Calculate the speech end time, reset the trigger state and temporary end time
            int speechEnd = (int) (tempEnd + speechPadSamples);
            tempEnd = 0;
            triggered = false;
            Map<String, Double> result = new HashMap<>();

            if (returnSeconds) {
                double speechEndSeconds = speechEnd / (double) samplingRate;
                double roundedSpeechEnd = BigDecimal.valueOf(speechEndSeconds).setScale(1, RoundingMode.HALF_UP).doubleValue();
                result.put("end", roundedSpeechEnd);
            } else {
                result.put("end", (double) speechEnd);
            }
            return result;
        }
    }

    // If none of the above conditions are met, return an empty map by default
    return Collections.emptyMap();
}

public void close() throws OrtException {
    reset();
    model.close();
}

}`
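
For reference, the byte-to-float loop in `apply` appears to assume 16-bit signed little-endian PCM input. Below is a minimal sketch of the same conversion written with `ByteBuffer`, under that assumption. The names `PcmConversion` and `toFloats` are hypothetical and are not part of silero-vad or ONNX Runtime.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class PcmConversion {
    // Assumes 16-bit signed little-endian PCM; a trailing odd byte is ignored,
    // matching the data.length / 2 sizing used in the detector's apply method.
    public static float[] toFloats(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        float[] out = new float[data.length / 2];
        for (int i = 0; i < out.length; i++) {
            // Same scaling constant as the original loop.
            out[i] = buf.getShort() / 32767.0f;
        }
        return out;
    }

    public static void main(String[] args) {
        // Three samples: 0x0000, 0x7FFF and 0x8001, stored low byte first.
        byte[] pcm = {0x00, 0x00, (byte) 0xFF, 0x7F, 0x01, (byte) 0x80};
        System.out.println(Arrays.toString(toFloats(pcm)));
    }
}
```

The number of floats produced is half the byte count, so the size of each `data` chunk passed to `apply` directly determines the window length that reaches the model.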

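`import ai.onnxruntime.OnnxTensor;
import ai.onnxruntime.OrtEnvironment;
import ai.onnxruntime.OrtException;
import ai.onnxruntime.OrtSession;

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SileroVadOnnxModel {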
// Define private variable OrtSession
private final OrtSession session;
private float[][][] h;
private float[][][] c;
// Define the last sample rate
private int lastSr = 0;
// Define the last batch size
private int lastBatchSize = 0;
// Define a list of supported sample rates
private static final List<Integer> SAMPLE_RATES = Arrays.asList(8000, 16000);

// Constructor
public SileroVadOnnxModel(String modelPath) throws OrtException {
    // Get the ONNX runtime environment
    OrtEnvironment env = OrtEnvironment.getEnvironment();
    // Create an ONNX session options object
    OrtSession.SessionOptions opts = new OrtSession.SessionOptions();
    // Set the InterOp thread count to 1, InterOp threads are used for parallel processing of different computation graph operations
    opts.setInterOpNumThreads(1);
    // Set the IntraOp thread count to 1, IntraOp threads are used for parallel processing within a single operation
    opts.setIntraOpNumThreads(1);
    // Add the CPU execution provider; the boolean controls whether the arena allocator is used
    opts.addCPU(true);
    // Create an ONNX session using the environment, model path, and options
    session = env.createSession(modelPath, opts);
    // Reset states
    resetStates();
}

/**
 * Reset states
 */
void resetStates() {
    h = new float[2][1][64];
    c = new float[2][1][64];
    lastSr = 0;
    lastBatchSize = 0;
}

public void close() throws OrtException {
    session.close();
}

/**
 * Define inner class ValidationResult
 */
public static class ValidationResult {
    public final float[][] x;
    public final int sr;

    // Constructor
    public ValidationResult(float[][] x, int sr) {
        this.x = x;
        this.sr = sr;
    }
}

/**
 * Function to validate input data
 */
private ValidationResult validateInput(float[][] x, int sr) {
    // Process the input data with dimension 1
    if (x.length == 1) {
        x = new float[][]{x[0]};
    }
    // Throw an exception when the input data dimension is greater than 2
    if (x.length > 2) {
        throw new IllegalArgumentException("Incorrect audio data dimension: " + x.length);
    }

    // Process the input data when the sample rate is not equal to 16000 and is a multiple of 16000
    if (sr != 16000 && (sr % 16000 == 0)) {
        int step = sr / 16000;
        float[][] reducedX = new float[x.length][];

        for (int i = 0; i < x.length; i++) {
            float[] current = x[i];
            float[] newArr = new float[(current.length + step - 1) / step];

            for (int j = 0, index = 0; j < current.length; j += step, index++) {
                newArr[index] = current[j];
            }

            reducedX[i] = newArr;
        }

        x = reducedX;
        sr = 16000;
    }

    // If the sample rate is not in the list of supported sample rates, throw an exception
    if (!SAMPLE_RATES.contains(sr)) {
        throw new IllegalArgumentException("Only supports sample rates " + SAMPLE_RATES + " (or multiples of 16000)");
    }

    // If the input audio block is too short, throw an exception
    if (((float) sr) / x[0].length > 31.25) {
        throw new IllegalArgumentException("Input audio is too short");
    }

    // Return the validated result
    return new ValidationResult(x, sr);
}

/**
 * Method to call the ONNX model
 */
public float[] call(float[][] x, int sr) throws OrtException {
    ValidationResult result = validateInput(x, sr);
    x = result.x;
    sr = result.sr;

    int batchSize = x.length;

    if (lastBatchSize == 0 || lastSr != sr || lastBatchSize != batchSize) {
        resetStates();
    }

    OrtEnvironment env = OrtEnvironment.getEnvironment();

    OnnxTensor inputTensor = null;
    OnnxTensor hTensor = null;
    OnnxTensor cTensor = null;
    OnnxTensor srTensor = null;
    OrtSession.Result ortOutputs = null;

    try {
        // Create input tensors
        inputTensor = OnnxTensor.createTensor(env, x);
        hTensor = OnnxTensor.createTensor(env, h);
        cTensor = OnnxTensor.createTensor(env, c);
        srTensor = OnnxTensor.createTensor(env, new long[]{sr});

        Map<String, OnnxTensor> inputs = new HashMap<>();
        inputs.put("input", inputTensor);
        inputs.put("sr", srTensor);
        inputs.put("h", hTensor);
        inputs.put("c", cTensor);

        // Call the ONNX model for calculation
        ortOutputs = session.run(inputs);
        // Get the output results
        float[][] output = (float[][]) ortOutputs.get(0).getValue();
        h = (float[][][]) ortOutputs.get(1).getValue();
        c = (float[][][]) ortOutputs.get(2).getValue();

        lastSr = sr;
        lastBatchSize = batchSize;
        return output[0];
    } finally {
        if (inputTensor != null) {
            inputTensor.close();
        }
        if (hTensor != null) {
            hTensor.close();
        }
        if (cTensor != null) {
            cTensor.close();
        }
        if (srTensor != null) {
            srTensor.close();
        }
        if (ortOutputs != null) {
            ortOutputs.close();
        }
    }
}

}`
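
As a side note on the `validateInput` length check above: `sr / x[0].length > 31.25` rejects any window shorter than `sr / 31.25` samples. The sketch below works that threshold out for the two supported sample rates. The names `MinWindow`, `isTooShort` and `minSamples` are hypothetical helpers written for illustration only.

```java
public class MinWindow {
    // Mirrors the check in validateInput: reject when sr / length > 31.25.
    public static boolean isTooShort(int sampleRate, int numSamples) {
        return ((float) sampleRate) / numSamples > 31.25f;
    }

    // Smallest window length (in samples) that passes the check above.
    public static int minSamples(int sampleRate) {
        return (int) Math.ceil(sampleRate / 31.25);
    }

    public static void main(String[] args) {
        for (int sr : new int[]{8000, 16000}) {
            System.out.println(sr + " Hz -> at least " + minSamples(sr) + " samples");
        }
    }
}
```

Because `apply` produces one sample per two input bytes, this corresponds to at least 1024 bytes per call at 16000 Hz and 512 bytes at 8000 Hz. A shorter chunk is rejected with an `IllegalArgumentException` before `session.run()` is reached, so in this code path it surfaces as a Java exception rather than a native crash.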

Urgency

No response

Platform

Linux

OS Version

CentOS Linux release 7.9.2009

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.12.1

ONNX Runtime API

Java

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

@github-actions github-actions bot added the api:Java issues related to the Java API label Apr 3, 2025
@Craigacp
Contributor

Craigacp commented Apr 3, 2025

Can you build the ORT native library with debug symbols and rerun it? Without the stack trace into the native code it's hard to see what's going on. Also does it fail with a newer version? 1.12.1 is pretty old and we've fixed a bunch of bugs since then.
