String Manipulation Utilities Documentation
Introduction
This Python module provides a set of utility functions to manipulate strings, making it easier to handle text processing tasks such as cleaning strings, removing comments, and extracting tables from queries. These utilities are part of the bigquery-advanced-utils package and aim to simplify text processing in data workflows.
Class Initialization
To use the String utilities, you first need to create an instance of the class:
from bigquery_advanced_utils.utils.string import String
string_utils = String()
The String class does not require any parameters during initialization. Once instantiated, you can call its methods for various string manipulation tasks.
Methods Overview
remove_chars_from_string
Description
Removes specific characters from a given string.
Parameters
string (str): The input string to process.
chars_to_remove (list[str]): A list of characters to remove from the input string.
Returns
str: A new string with the specified characters removed.
Raises
InvalidArgumentToFunction: If the input parameters are invalid (e.g., string is None, or chars_to_remove is not a non-empty list).
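The validation rules and return value described above can be approximated with a small standalone sketch. This is not the package's implementation: the function name remove_chars_sketch is hypothetical, and ValueError stands in for InvalidArgumentToFunction so that the snippet runs without the package installed.

```python
from __future__ import annotations


def remove_chars_sketch(string: str, chars_to_remove: list[str]) -> str:
    """Illustrative stand-in for remove_chars_from_string (hypothetical)."""
    # Mirror the documented validation: reject None and anything that is
    # not a non-empty list. ValueError is an assumption made for this sketch.
    if string is None:
        raise ValueError("string must not be None")
    if not isinstance(chars_to_remove, list) or not chars_to_remove:
        raise ValueError("chars_to_remove must be a non-empty list")
    # Remove every occurrence of each listed character.
    for char in chars_to_remove:
        string = string.replace(char, "")
    return string


print(remove_chars_sketch("Hello, World!", [",", "!"]))  # prints: Hello World
```

Only the listed characters are removed, so the space after the comma survives.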
Example
from bigquery_advanced_utils.utils.string import String
string_utils = String()
input_string = "Hello, World!"
chars_to_remove = [",", "!"]
# InvalidArgumentToFunction must also be in scope (import it from the package).
try:
    cleaned_string = string_utils.remove_chars_from_string(input_string, chars_to_remove)
    print(cleaned_string)  # Output: "Hello World"
except InvalidArgumentToFunction:
    print("Invalid arguments provided to the function.")
remove_comments_from_string
Description
Removes all comments from a given string based on the specified dialect (default is standard_sql).
Parameters
string (str): The input string, typically a SQL query.
dialect (str): The dialect to use for identifying comments. Default is standard_sql.
Returns
str: The string with all comments removed.
Raises
InvalidArgumentToFunction: If the input parameters are invalid (e.g., string is None).
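To make the idea of comment removal concrete, here is a naive regex-based sketch. It is not the package's implementation and it ignores the dialect parameter: the name strip_sql_comments_sketch is hypothetical, and the pattern assumes the three comment forms accepted by BigQuery standard SQL ("--" and "#" to end of line, and /* ... */ blocks). It does not protect comment markers that appear inside string literals.

```python
import re

# Assumed comment syntax: /* ... */ blocks, plus "--" and "#" to end of line.
# Naive by design: markers inside quoted strings are treated as comments too.
_COMMENT_PATTERN = re.compile(r"/\*.*?\*/|--[^\n]*|#[^\n]*", re.DOTALL)


def strip_sql_comments_sketch(query: str) -> str:
    """Illustrative stand-in for remove_comments_from_string (hypothetical)."""
    return _COMMENT_PATTERN.sub("", query)


query = "SELECT * FROM table -- This is a comment\nWHERE id = 1;"
# The comment text is removed; the line break and the space before the
# comment are left in place.
print(strip_sql_comments_sketch(query))
```

A production version would need a tokenizer or SQL parser to handle quoted strings and nested constructs correctly.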
Example
input_query = """SELECT * FROM table -- This is a comment
WHERE id = 1;"""
try:
    cleaned_query = string_utils.remove_comments_from_string(input_query)
    print(cleaned_query)  # Output: "SELECT * FROM table WHERE id = 1;"
except InvalidArgumentToFunction:
    print("Invalid arguments provided to the function.")
extract_tables_from_query
Description
Extracts all table names from a SQL query string.
Parameters
string(str): The input SQL query string.
Returns
list[str]: A list of unique table names extracted from the query.
Raises
InvalidArgumentToFunction: If the input string is None.
Example
query = """SELECT * FROM `project.dataset.table` JOIN `project.dataset.table2` ON id = id
-- Commented out code
WHERE condition;"""
try:
    tables = string_utils.extract_tables_from_query(query)
    print(tables)  # Output: ["project.dataset.table", "project.dataset.table2"]
except InvalidArgumentToFunction:
    print("Invalid arguments provided to the function.")
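For readers curious how such an extraction could work, the sketch below approximates it with regular expressions. It is an assumption-laden illustration, not the package's implementation: the name extract_tables_sketch is hypothetical, only backtick-quoted names that directly follow FROM or JOIN are recognized, and only "--" line comments are stripped beforehand.

```python
from __future__ import annotations

import re

# Assumption: table names are backtick-quoted and follow FROM or JOIN.
_TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+`([^`]+)`", re.IGNORECASE)


def extract_tables_sketch(query: str) -> list[str]:
    """Illustrative stand-in for extract_tables_from_query (hypothetical)."""
    # Drop "--" line comments first so commented-out tables are ignored.
    without_comments = re.sub(r"--[^\n]*", "", query)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(_TABLE_PATTERN.findall(without_comments)))


query = (
    "SELECT * FROM `project.dataset.table` "
    "JOIN `project.dataset.table2` ON id = id\n"
    "-- FROM `project.dataset.old`\n"
    "WHERE condition;"
)
print(extract_tables_sketch(query))
# prints: ['project.dataset.table', 'project.dataset.table2']
```

Unquoted table names, subqueries, and comma-separated FROM lists are out of scope for this sketch; a SQL parser is the more robust choice for those cases.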