C# word boundary regex instead of .Contains() needed

I have a list:

var myList = new List<string> { "red", "blue", "green" };

I have a string:

var myString = "Alfred has a red and blue tie";

I am trying to get a count of matches of words in myList within myString. Currently, I am using .Contains(), which gets me a count of 3 because it is picking up the “red” in “Alfred”. I need to be able to isolate whole words instead. How can this be achieved?

var count = myList.Where(ml => myString.Contains(ml)); // gets 3, want 2


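        var myList = new List<string> { "red", "blue", "green" };
        // Escape each word so any regex metacharacters in the list are matched literally,
        // then wrap the alternation in \b...\b so only whole words match.
        Regex r = new Regex(@"\b(" + string.Join("|", myList.Select(w => Regex.Escape(w))) + @")\b");
        MatchCollection m = r.Matches("Alfred has a red and blue tie");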

m.Count will give you the number of times red, blue or green is found as a whole word. \b matches a word boundary.
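To see what \b changes, here is a minimal sketch, assuming a hypothetical helper named CountWholeWord that is not part of the answer above. It counts one word at a time and compares the result with an unanchored pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Hypothetical helper (not from the original answer): counts whole-word
    // occurrences of a single word in the given text.
    public static int CountWholeWord(string text, string word)
    {
        // \b matches the position between a word character (letter, digit,
        // underscore) and a non-word character or the start/end of the string.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(text, pattern).Count;
    }

    public static void Main()
    {
        // The "red" inside "Alfred" is skipped: prints 1.
        Console.WriteLine(CountWholeWord("Alfred has a red tie", "red"));

        // Punctuation also counts as a boundary: prints 2.
        Console.WriteLine(CountWholeWord("red, then red again", "red"));

        // Without \b the pattern also matches inside "Alfred": prints 2.
        Console.WriteLine(Regex.Matches("Alfred has a red tie", "red").Count);
    }
}
```

Because punctuation is treated as a boundary, "red," still counts as a match here, which a plain whitespace split would miss.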

Each element of m is of type Match, and you can look at each index to get more info (i.e. m[0].Value gives you the matched string ("red") and m[0].Index gives you its location in the original string (13)).
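As a rough illustration of reading Value and Index from each match, the sketch below wraps the same pattern in a hypothetical FindWords helper (the helper name and class are assumptions, not part of the answer) and prints every match with its position.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchInfoDemo
{
    // Hypothetical helper: returns every whole-word match of any word in the list.
    public static List<Match> FindWords(string text, IEnumerable<string> words)
    {
        string pattern = @"\b(" + string.Join("|", words.Select(w => Regex.Escape(w))) + @")\b";

        // MatchCollection is non-generic on older frameworks, so Cast<Match>()
        // keeps this working on both old and new runtimes.
        return Regex.Matches(text, pattern).Cast<Match>().ToList();
    }

    public static void Main()
    {
        var myList = new List<string> { "red", "blue", "green" };

        foreach (Match match in FindWords("Alfred has a red and blue tie", myList))
        {
            // Prints "red at 13" and then "blue at 21".
            Console.WriteLine($"{match.Value} at {match.Index}");
        }
    }
}
```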


var count = (from s in myList
            join ms in myString.Split() on s equals ms
            select new { s, ms }).Count();


Something like this?

var numMatches = myString.Split().Intersect(myList).Count();

Note that this doesn’t consider duplicate occurrences.

If you do want to consider duplicates, go with @Justin Niessner’s technique.
Here’s an alternative, with an intermediary lookup:

var words = myString.Split().ToLookup(word => word);
var numMatches = myList.Sum(interestingWord => words[interestingWord].Count());
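To make the duplicate-handling difference concrete, here is a small sketch comparing the two approaches. The class, the method names, and the sample sentence with a repeated word are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateDemo
{
    // Intersect yields each common word once, so repeats are not counted.
    public static int CountDistinct(string text, IEnumerable<string> wanted)
    {
        return text.Split().Intersect(wanted).Count();
    }

    // The lookup groups identical words, so repeats are counted.
    public static int CountWithDuplicates(string text, IEnumerable<string> wanted)
    {
        var words = text.Split().ToLookup(word => word);

        // Indexing a lookup with a missing key returns an empty sequence,
        // so words that never occur simply add 0.
        return wanted.Sum(w => words[w].Count());
    }

    public static void Main()
    {
        var myList = new List<string> { "red", "blue", "green" };
        string sample = "red blue red tie"; // hypothetical input with a repeated word

        Console.WriteLine(CountDistinct(sample, myList));       // prints 2
        Console.WriteLine(CountWithDuplicates(sample, myList)); // prints 3
    }
}
```

Both versions split on whitespace only, so a word followed by punctuation (for example "red,") would not match; the regex answer above does not have that limitation.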


This works, though I am not sure it is the most optimized approach.
