 FileMaker Pro Custom Functions

uniqueWords ( anyText ; minimWordLength ; caseSensitive ; shouldSkipNumbers )

Jan, -
http://users.skynet.be/fa022720/_computing/

Turns text into a value list of its unique words of a given minimum word length, with or without numbers.

Sample Input:
("The Proof 2008, the proof - 2008",0,0,0)

("The Proof 2008, the proof - 2008",4,0,0)

("The Proof 2008, the proof - 2008",0,1,0)

("The Proof 2008, the proof - 2008",0,0,1)


Sample Output:
the¶proof¶2008¶

proof¶2008¶

The¶Proof¶2008¶the¶proof¶

the¶proof¶


Description:

QUICK & DIRTY WORD DEDUPLICATION IN 1 STANDALONE FUNCTION

This function recursively parses any text input into a return-separated, deduplicated list of its words. You can specify the minimum word length, case sensitivity, and how numbers in the text are treated.
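
The following Python sketch models the behavior described above; it is not the FileMaker calculation itself, whose body is not reproduced on this page. It is iterative rather than recursive, and the separator set (whitespace and hyphens, plus punctuation stripped from the ends of each word) is an assumption chosen to reproduce the four sample outputs. The name `unique_words` and its keyword arguments are hypothetical Python stand-ins for the function's four parameters, and "\n" stands in for FileMaker's return character (¶).

```python
import re


def unique_words(any_text, minim_word_length=0, case_sensitive=False,
                 should_skip_numbers=False):
    """Model of uniqueWords: return the unique words as a return-delimited list."""
    # Assumed separators: whitespace and hyphens. The real function lists
    # more separators in its Substitute call.
    tokens = re.split(r"[\s\-]+", any_text)
    seen = set()
    result = []
    for token in tokens:
        # Assumption: punctuation attached to either end of a word is dropped.
        word = token.strip(".,;:!?\"'()")
        if not case_sensitive:
            word = word.lower()
        if not word or len(word) < minim_word_length:
            continue
        # Only unpunctuated, standalone numbers count as numbers here.
        if should_skip_numbers and re.fullmatch(r"[0-9]+", word):
            continue
        if word not in seen:
            seen.add(word)
            result.append(word)
    # The sample outputs end every value with a return, so one is kept here.
    return "".join(word + "\n" for word in result)


if __name__ == "__main__":
    sample = "The Proof 2008, the proof - 2008"
    print(unique_words(sample, 0, False, False))
    print(unique_words(sample, 4, False, False))
    print(unique_words(sample, 0, True, False))
    print(unique_words(sample, 0, False, True))
```

Keeping first-occurrence order, rather than sorting, matches the order shown in the sample outputs.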

Uses: uniqueWords was created as part of an effort to automatically calculate an instant content correlation index between text records. Other obvious uses come to mind: building glossary or index lists, automatic keyword list generation, etc.

With this updated version, texts of more than 10000 words can in principle be processed, as long as they contain fewer than 10000 DIFFERENT words. Calculations on texts of more than a few thousand words cause noticeable stalling on most systems.

DETAILS & HOWTO:

UniqueWords uses recursion to call FileMaker Pro's "substitute" function for every different word and number in the input text. This has 3 important consequences:

1) The greater the number of DIFFERENT words (and numbers!), the slower the calculation.
2) While it's usable in principle on texts with several tens of thousands of words and numbers, it fails when the number of DIFFERENT words and numbers approaches or exceeds 10000 - FMPro's recursion limit is 10000 iterations.
3) Deduplication is decent overall but can be incomplete due to limitations in FMPro's "substitute" function - for 100% deduplication, replace the "substitute" calls with Peter Wagemans' "substituteCompletely" (refer to its detail page on this site for more info on FMPro's "substitute" limitations). Alternatively, call uniqueWords nested inside a deduplicating sort function for a 100% deduplicated, sorted value list of unique words (search this site with keyword "sort" for several versions).
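
As a rough illustration of points 2 and 3, the Python sketch below shows what a deduplicating sort wrapper does to a return-delimited list that still contains a leftover duplicate, and how one might count distinct values to see how close a text is to the 10000 limit. It is a model only: `sort_unique_values`, `distinct_value_count` and `RECURSION_LIMIT` are hypothetical names, not functions from this site, and "\n" stands in for FileMaker's return character.

```python
RECURSION_LIMIT = 10000  # the iteration limit quoted in point 2


def sort_unique_values(value_list, case_sensitive=False):
    """Return the values of a return-delimited list, deduplicated and sorted."""
    # Drop empty values, such as the one that follows a trailing return.
    values = [value for value in value_list.split("\n") if value]
    if not case_sensitive:
        values = [value.lower() for value in values]
    return "\n".join(sorted(set(values)))


def distinct_value_count(value_list):
    """Count the distinct non-empty values in a return-delimited list."""
    return len({value for value in value_list.split("\n") if value})


if __name__ == "__main__":
    # A list in which one duplicate ("proof") survived deduplication.
    leftover = "the\nproof\n2008\nproof\n"
    print(sort_unique_values(leftover))
    print(distinct_value_count(leftover) < RECURSION_LIMIT)
```

Because the wrapper rebuilds the list from a set, it also drops the trailing return that the sample outputs show.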

The function first separates the words of the input text into a return-delimited word list, based on the word separators listed in the multiple "substitute" call in the first "Case" block. Your browser may not correctly render all word separator characters in the HTML of this web page. If you run into trouble with that, download the raw text version from my site via http://users.skynet.be/fa022720/_computing/txt-downloads/uniquewords.txt.zip.

Parameter use:

1) Set minimWordLength to, for instance, 3 to exclude the often generic or meaningless words of 1 and 2 letters (a, I, it, on, of, up,...). Set it to 0 or 1 to include everything.
2) Set caseSensitive to 1 to return "Jobs" and "jobs" as 2 different words. If caseSensitive is set to 0, case is ignored in the deduplication and all unique words are returned lowercase.
3) Set shouldSkipNumbers to 1 to filter out unpunctuated, standalone numbers like 118 or 5 (but not 11,8 or 5th or time notations like 23:59:59); set it to 0 to include them. The function parses date notations like 9-11-2001 into the numbers 9, 11 and 2001 before considering the shouldSkipNumbers setting.
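
The number rule in point 3 can be sketched in Python as follows. This is a model under stated assumptions, not the FileMaker code: it assumes that whitespace and hyphens are the only separators, and `split_words`, `is_standalone_number` and `skip_numbers` are hypothetical helper names. It shows why a date is removed entirely (it is split into plain numbers first) while 11,8, 5th and 23:59:59 survive.

```python
import re


def split_words(text):
    """Split on the assumed separators: whitespace and hyphens."""
    return [token for token in re.split(r"[\s\-]+", text) if token]


def is_standalone_number(token):
    """True only for unpunctuated numbers made up of digits alone."""
    return re.fullmatch(r"[0-9]+", token) is not None


def skip_numbers(tokens):
    """Model shouldSkipNumbers = 1: drop standalone numbers, keep the rest."""
    return [token for token in tokens if not is_standalone_number(token)]


if __name__ == "__main__":
    tokens = split_words("118 5 11,8 5th 23:59:59 9-11-2001")
    print(tokens)                # the date has become 9, 11 and 2001
    print(skip_numbers(tokens))  # only the punctuated or mixed tokens remain
```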

Note: these functions are not guaranteed or supported by BrianDunning.com. Please contact the individual developer with any questions or problems.

Discuss:

This calc leaves a trailing return. How do I get rid of that?

Mike, Minot
December 07, 2009 11:29pm

That is just what I'm looking for. You've saved me a lot of work for which I'm very grateful!!

Mike, Liverpool, UK
July 09, 2013 6:03am
