INTERVIEW QUESTION - Count the frequency of words appearing in a string Using Python

Published on Sep 15, 2024

Introduction

In this tutorial, we will learn how to count the frequency of words appearing in a string using Python. This is a common interview question that tests your understanding of string manipulation and data structures. By the end of this guide, you will be able to implement a function that efficiently counts word occurrences in any given string.

Step 1: Set Up Your Python Environment

Before diving into the code, ensure that you have Python installed on your computer. You can download it from the official Python website.

  • Open your code editor or IDE (like PyCharm, VSCode, or Jupyter Notebook).
  • Create a new Python file named word_frequency.py.

Step 2: Define the Function

Let's start by defining a function that will take a string as input and return a dictionary containing the frequency of each word.

import string

def word_frequency(input_string):
    # Convert the string to lowercase to make the count case insensitive
    input_string = input_string.lower()

    # Replace every ASCII punctuation character (including '!' and '?') with a space
    for char in string.punctuation:
        input_string = input_string.replace(char, ' ')
    
    # Split the string into words
    words = input_string.split()
    
    # Create a dictionary to hold word counts
    frequency = {}
    
    # Count the frequency of each word
    for word in words:
        if word in frequency:
            frequency[word] += 1
        else:
            frequency[word] = 1
            
    return frequency
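
The same counting logic can also be sketched with the standard library's collections.Counter and a regular expression. This is an alternative to the loop above, not a replacement for it; the function name word_frequency_counter and the regex pattern are illustrative assumptions.

```python
import re
from collections import Counter

def word_frequency_counter(text):
    # Hypothetical helper: lowercase the text, then extract runs of
    # letters/digits, keeping apostrophes only inside a word (e.g. "don't")
    words = re.findall(r"[a-z0-9]+(?:'[a-z0-9]+)*", text.lower())
    # Counter is a dict subclass that tallies each item it is given
    return Counter(words)

counts = word_frequency_counter("Hello world! Hello everyone.")
print(counts["hello"])  # 2
print(counts["world"])  # 1
```

Because the pattern only matches ASCII letters and digits, accented or non-Latin words would need a broader pattern.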

Practical Tips

  • Using lower() ensures that the counting is case insensitive, meaning "Word" and "word" will be counted as the same.
  • Removing punctuation is essential to avoid counting words like "word." and "word," as separate entities.
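
To see the effect of both tips, the following sketch compares a naive split with a normalized one. It uses str.translate to delete punctuation, which is a different technique from the replace loop in the function above; the sample text is made up for illustration.

```python
import string
from collections import Counter

text = "Word word. word, WORD"

# Naive split: casing and attached punctuation produce four distinct keys
naive = Counter(text.split())

# Normalized: lowercase, then delete ASCII punctuation with str.translate
table = str.maketrans("", "", string.punctuation)
normalized = Counter(text.lower().translate(table).split())

print(len(naive))          # 4
print(normalized["word"])  # 4
```

Note that deleting punctuation outright joins hyphenated words ("well-known" becomes "wellknown"), whereas replacing punctuation with a space splits them into two words. Which behavior is right depends on the task.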

Step 3: Test the Function

Now that we have defined our function, let's test it with an example string.

if __name__ == "__main__":
    test_string = "Hello world! Hello everyone. Welcome to the world of Python."
    result = word_frequency(test_string)
    print(result)

Common Pitfalls

  • Make sure to test your function with various strings, including those with different casing and punctuation, to ensure accuracy.
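  • Extra whitespace is not a problem, because split() with no arguments ignores leading, trailing, and repeated whitespace. Watch out instead for characters that are not stripped beforehand (for example curly quotes or other non-ASCII punctuation), which stay attached to the word and produce separate entries.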
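
A few quick checks along these lines might look like the sketch below. It repeats a compact variant of the function so that it runs on its own, and it assumes that all ASCII punctuation (string.punctuation) is replaced with spaces.

```python
import string

def word_frequency(input_string):
    # Compact variant: lowercase, replace ASCII punctuation with spaces, count
    input_string = input_string.lower()
    for char in string.punctuation:
        input_string = input_string.replace(char, " ")
    frequency = {}
    for word in input_string.split():
        frequency[word] = frequency.get(word, 0) + 1
    return frequency

# Edge cases worth checking
assert word_frequency("") == {}
assert word_frequency("   spaced   out   ") == {"spaced": 1, "out": 1}
assert word_frequency("Word word WORD") == {"word": 3}
assert word_frequency("end. end, end!") == {"end": 3}
assert word_frequency("well-known") == {"well": 1, "known": 1}
print("all checks passed")
```

One consequence of this approach is that contractions are split: "it's" is counted as "it" and "s".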

Step 4: Analyze the Output

When you run the script, the frequency count of each word is printed as a dictionary. With punctuation fully stripped, the test string above produces:

{'hello': 2, 'world': 2, 'everyone': 1, 'welcome': 1, 'to': 1, 'the': 1, 'of': 1, 'python': 1}

Real-World Application

This function can be useful in various scenarios, such as:

  • Analyzing text data in natural language processing.
  • Generating word clouds to visualize text data.
  • Conducting data analysis for text-based applications.

Conclusion

In this tutorial, we learned how to count the frequency of words in a string using Python. We defined a function that handles punctuation and case sensitivity, tested it with an example, and analyzed the output. You can enhance this function further by adding features such as ignoring common stop words or sorting the results by frequency. Happy coding!
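
As a starting point for those enhancements, here is a minimal sketch that filters stop words and sorts by frequency. The top_words helper and the tiny STOP_WORDS set are hypothetical; a real project would likely use a fuller stop-word list.

```python
# Hypothetical, deliberately tiny stop-word list for illustration
STOP_WORDS = {"the", "to", "of", "a", "an", "and"}

def top_words(frequency, n=3, stop_words=STOP_WORDS):
    # Drop stop words, keeping (word, count) pairs
    filtered = [(w, c) for w, c in frequency.items() if w not in stop_words]
    # Sort by count descending, then alphabetically so ties are deterministic
    filtered.sort(key=lambda item: (-item[1], item[0]))
    return filtered[:n]

counts = {'hello': 2, 'world': 2, 'everyone': 1, 'welcome': 1,
          'to': 1, 'the': 1, 'of': 1, 'python': 1}
print(top_words(counts))  # [('hello', 2), ('world', 2), ('everyone', 1)]
```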