Skip to content

Impact Levels in Validoopsie

Validoopsie uses impact levels to determine how validation failures should be handled. Properly setting the impact level helps control the behavior of your data validation pipeline.

Available Impact Levels

Validoopsie supports three impact levels:

Level Description Behavior
"low" Minor validation issues that don't significantly affect data quality Records failures in results but doesn't interrupt processing
"medium" Moderate validation issues that may affect data quality Records failures in results but doesn't interrupt processing
"high" Critical validation issues that severely affect data quality Raises a ValueError exception when .validate() is called

How to Set Impact Levels

Impact levels can be set on any validation:

from validoopsie import Validate
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "value": [10, 20, 30, 200, 40]
})

validator = Validate(df)

# Low impact - will be recorded but won't raise exceptions
validator.ValuesValidation.ColumnValuesToBeBetween(
    column="value", 
    min_value=0, 
    max_value=100,
    impact="low"  # This is the default if not specified
)

# Medium impact - generally used for important but not critical validations
validator.TypeValidation.TypeCheck(
    column="id",
    type_name="int64",
    impact="medium"
)

# High impact - will raise an exception when validate() is called
validator.NullValidation.ColumnNotBeNull(
    column="id",
    impact="high"  # This will cause validate() to raise an exception if nulls are found
)

# This will raise an exception if any high-impact validations fail
try:
    validator.validate()
except ValueError as e:
    print(f"Validation failed: {e}")

When to Use Each Impact Level

  • Low: Use for informational validations or when failures can be tolerated

    • Example: Checking if string values follow a preferred format
    • Example: Verifying that numerical values fall within suggested (but not required) ranges
  • Medium: Use for important validations where failures should be highlighted

    • Example: Checking if dates fall within an expected range
    • Example: Verifying that categorical values belong to an expected set
  • High: Use for critical validations where failures should interrupt processing

    • Example: Ensuring primary key columns don't contain null values
    • Example: Verifying that required fields are present
    • Example: Checking data integrity constraints that must be satisfied

Default Impact Level

If not specified, all validations default to an impact level of "low".