Search for long phrases

Vaissiere
Read the guide!
Posts: 39
Joined: Sun Dec 26, 2010 10:46 am
Location: Paris

Search for long phrases

Post by Vaissiere »

Dear all,
Although I am no Proust, my editor has asked me to cut my long French sentences so that none runs longer than 4 lines. However, I cannot work out how to find these long sentences (and only them) using Mellel's search engine. Basically, I need to search for something like "any run of more than 300 characters ending with a period (or ? or ! or …)". I have tried modifying the "any sentence" search in the list of search actions, but with no success.
Do you have any ideas? I crashed Mellel several times while trying…
More generally, a repository of such search actions would be useful.

yours

Etienne
Eyal Redler
Co-founder
Posts: 692
Joined: Thu Oct 27, 2005 9:15 am

Re: Search for long phrases

Post by Eyal Redler »

Here's how you can do this

1. Enter the sentence terminators (period, question mark, etc.) into the Find field, separated by an "OR" element (choose "OR" from the little triangle menu).
2. Select the terminators and group them (choose "Group" from the triangle menu).
3. Place the insertion point before the group and insert an "Any Character" element (again, choose it from the triangle menu).
4. Double-click the "Any" element and choose "Once or more" in the "Repetition" popup menu; also click the "Greedy" checkbox.
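
For readers who think in regular expressions, the four steps above correspond roughly to the pattern below. This is only a sketch using Python's `re` module, not Mellel's internal expression; the pattern itself and the name `ANY_SENTENCE` are assumptions made for illustration.

```python
import re

# Assumed regex equivalent of the find expression built in steps 1-4:
# "Any Character", once or more, greedy, followed by a group of terminators.
ANY_SENTENCE = re.compile(r".+(?:\.|\?|!|…)")

text = "Short one. Is this longer? Yes!"
match = ANY_SENTENCE.search(text)

# Greedy repetition runs on to the LAST terminator on the line,
# so the match covers all three sentences, not just the first.
print(match.group(0))
```

Note that this finds sentences of any length; it does not yet apply a 300-character threshold.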

Here's a download link to a find set containing this action:

https://s3.amazonaws.com/mellel.other/F ... tences.pfs
Eyal Redler
----------------------
Co-Founder and Owner at Mellel
Facebook: http://www.facebook.com/mellelwordprocessor
YouTube: http://www.youtube.com/user/MellelRedlex
Donate: https://www.paypal.com/donate/?hosted_b ... 2LWB33YBZW
Icelander
Knows everything, can prove it
Posts: 366
Joined: Mon Aug 18, 2014 10:59 pm

Re: Search for long phrases

Post by Icelander »

Eyal Redler wrote: Wed Apr 05, 2023 11:19 am Here's how you can do this
I'm afraid this doesn't solve the problem. The original poster only wants to find sentences that contain 300 characters or more. Sentences containing fewer characters should not be found.

In theory, search criteria, like the following, seem to be the way to go:
"Any Character" with Repetition: 'This much or more' and Min. 300.

But…, unfortunately that doesn't work either.
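
One plausible reason, sketched with an ordinary regex in Python: "Any Character" also matches the terminators themselves, so a run of 300 or more characters can span several short sentences. This is an assumption about how such patterns behave in general, not a statement about Mellel's engine; `ANY_CHAR_300` is an illustrative name.

```python
import re

# Assumed regex equivalent: any character at least 300 times, then a terminator.
ANY_CHAR_300 = re.compile(r".{300,}[.?!…]")

short_sentence = "This sentence is short. "   # 24 characters
paragraph = short_sentence * 20               # 480 characters, no long sentence

match = ANY_CHAR_300.search(paragraph)

# A match exists even though every sentence is short, because "." in the
# pattern also matches the periods between sentences.
print(match is not None, match.group(0).count("."))
```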

This brings me to regex (regular expression). Under the hood, Mellel's search engine uses regex, if I'm not mistaken, and I remember you saying that regex improvement is planned for Mellel. Currently it's impossible to enter regex directly into the Find & Replace box in Mellel. If that were possible, we could easily enter the find criteria into a regex-savvy program, for example Nisus, and then simply paste them back into Mellel.

In our particular case, this is the solution:
/Start of Sentence/AnyCharacter/CharacterNotInSet[.!…]/300-2000 Times/Capture(.OR!OR…)

This means we look for sentences of at least 300 and at most 2000 characters; all sentences with fewer than 300 characters are ignored.

When the search criterion is pasted into Mellel it is automatically converted to:
(?<=^|\.\s|\.\u201D\s|\."\s|\!\s|\!\u201D\s|\!"\s|\?\s|\?\u201D\s|\?"\s)\X[^.]{300,2000}(.|!|…)

This is a valid regular expression, but Mellel can't cope with it.
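
For comparison, here is a hedged Python sketch of what the expression is meant to do. Python's `re` module does not support `\X` and only allows fixed-width lookbehind, so this is a simplified adaptation rather than the Nisus expression itself. The names `LONG_SENTENCE` and `find_long_sentences` are hypothetical, and treating all of `.`, `!`, `?` and `…` as terminators is an assumption.

```python
import re

# Sentence start: beginning of text, or just after a terminator plus whitespace.
# Body: 300 to 2000 characters that are not terminators. End: one terminator.
LONG_SENTENCE = re.compile(r"(?:^|(?<=[.!?…]\s))[^.!?…]{300,2000}[.!?…]")

def find_long_sentences(text: str) -> list[str]:
    """Return sentences whose body has 300 to 2000 non-terminator characters."""
    return LONG_SENTENCE.findall(text)

short = "A short sentence."
long_one = "word " * 70 + "end."      # 353 characters before the period
sample = f"{short} {long_one} {short}"

# Only the long sentence is returned; both short ones are skipped.
print(find_long_sentences(sample) == [long_one])
```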

And this brings me to the find option /Start of Sentence/:
If I'm not totally mistaken, there exists no /Start of Sentence/ search option in Mellel, only /Start of Paragraph/.
Could a /Start of Sentence/ be implemented? Consider this as a feature request.

And lastly:
You recently changed the Mellel website and asked us to provide some feedback if we "see any […] glaring omissions."

YES, there is one glaring omission: there is no way to post screenshots !!!!!!!!!
Amontillado
Knows everything, can prove it
Posts: 148
Joined: Fri May 04, 2018 4:00 am

Re: Search for long phrases

Post by Amontillado »

I think the problem is with repetition for any character. There seems to be a workaround, though.

If you want to find 300 or more repetitions of any character, click the "any character" glyph to highlight it. Use the triangle menu to group it.

A group of just one thing, "any character."

Then, double click the group brace. Make sure the group is highlighted, not the any character glyph.

Choose greedy, "this much or more," and enter 300 in the Min field.

I haven't done much testing, so please see if putting "any character" in a group by itself gets the repetition you need.

Apologies for inexact information. I've been terribly busy with affairs of the high seas, which is to say I've been at a tall ships festival. If I break out into sea chanties, it will pass soon.
Icelander
Knows everything, can prove it
Posts: 366
Joined: Mon Aug 18, 2014 10:59 pm

Re: Search for long phrases

Post by Icelander »

Hello Amontillado,

I'm afraid this method doesn't make any difference either.

To get a correct result, make sure your test document consists of multiple *sentences* of different length, not of (single) paragraphs. For test purposes it's best to have only one paragraph, that is, many sentences in one paragraph.

Under those conditions, if you position the insertion point at the top of the document and then execute your find command, the whole document will be selected. :–)

Am I right?
Amontillado
Knows everything, can prove it
Posts: 148
Joined: Fri May 04, 2018 4:00 am

Re: Search for long phrases

Post by Amontillado »

Well, drat. I missed the target. How about this?

For simplicity's sake, let's start with sentences ending with a period.

Enter a dot in the Find window. That's a dot character, not a regex single character wildcard.

Double-click it. Once the dot is highlighted, choose "Group" from the triangle menu, making it a group of one thing, a period.

Double click either opening or closing bracket for the group, choose "this much or more", enter the desired count in the Min field, and select the options for "greedy" and "negated."

I believe that will find sentences longer than the threshold.

My sample text, some drivel written by a talentless hack – it looked like something I wrote myself last week – had a sentence ending in a quoted question, terminated with a question mark followed by quote.

To catch that one, I changed the pattern from "dot in a group by itself" to dot OR question mark followed by a quote mark.

Then there's the noble interrobang, which is ?! as a sentence terminator. I changed the negated group to dot or question-quote or question-exclamation.

Greedy causes Find to select up to the last character that is not dot or ?" or interrobang, since the group finds characters not matching the pattern in the group, 300 or more times.
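
In regex terms, my guess at what the negated group does looks like the sketch below (Python's `re` module; an assumption about the logic, not Mellel's actual implementation). A single-character terminator can be written as a negated character class, while multi-character terminators such as ?" need a negative lookahead in front of each consumed character. The names `NOT_DOT_300` and `NEGATED_GROUP_300` are made up for illustration.

```python
import re

# Single-character case: 300 or more characters that are NOT a period.
NOT_DOT_300 = re.compile(r"[^.]{300,}")

# Multi-character case: at each position, refuse to start on  .  or  ?"  or  ?!
# and otherwise consume one character; repeat 300 or more times, greedily.
NEGATED_GROUP_300 = re.compile(r'(?:(?!\.|\?"|\?!).){300,}')

long_sentence = "x" * 320 + '?" And a short one.'
match = NEGATED_GROUP_300.search(long_sentence)

# The match stops just before the ?" terminator.
print(len(match.group(0)))

# None: only 299 non-period characters precede the period.
too_short = NOT_DOT_300.search("a" * 299 + ".")
print(too_short)
```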

Did I hit the target?
Icelander
Knows everything, can prove it
Posts: 366
Joined: Mon Aug 18, 2014 10:59 pm

Re: Search for long phrases

Post by Icelander »

Congratulations! I think you have found the solution.

If I understand this search logic correctly, then we search for a dot 300 times, but since we have enabled the option "Negated" dots are ignored 300 times. Does that make sense?

The following setting seems to work too, albeit with a minor inaccuracy:
.Character Set
[This much or more, Min 300]
Greedy
Negated
Set .

If I were the original poster, I would also insert "Found Expression" into the Replace field from the Replace menu and color it red, or any other color that stands out. 'Replace All' will then color all sentences longer than 300 characters in one go, which makes editing a long document much easier.
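
Outside Mellel, the same "replace with the found expression, but marked" idea can be sketched in Python. The marker strings, the pattern, and the function name `mark_long_sentences` are all assumptions chosen for illustration; plain text has no color, so visible brackets stand in for it.

```python
import re

# Assumed equivalent of the negated set: 300 or more characters that are not a period.
LONG_RUN = re.compile(r"[^.]{300,}")

def mark_long_sentences(text: str, start: str = "[[", end: str = "]]") -> str:
    """Wrap every long run in visible markers, like coloring the Found Expression."""
    return LONG_RUN.sub(lambda m: f"{start}{m.group(0)}{end}", text)

sample = "Short. " + "y" * 310 + ". Short again."
marked = mark_long_sentences(sample)

# The short sentences are left alone; the long run is wrapped in markers.
print(marked[:12])
```

In this sketch the marked run begins with the space that follows the previous period, and the closing period is left outside the markers, because the pattern only counts characters that are not a period.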
Amontillado
Knows everything, can prove it
Posts: 148
Joined: Fri May 04, 2018 4:00 am

Re: Search for long phrases

Post by Amontillado »

That's a nice touch, highlighting the found results.

The negated search means "search for anything that doesn't match the pattern." It's looking for 300 or more things that don't match.

I'm not sure if the original problem is a bug, or a consequence of some other logic of searching.
Eyal Redler
Co-founder
Posts: 692
Joined: Thu Oct 27, 2005 9:15 am

Re: Search for long phrases

Post by Eyal Redler »

Icelander wrote: Sat Apr 15, 2023 4:21 pm
Eyal Redler wrote: Wed Apr 05, 2023 11:19 am Here's how you can do this
I'm afraid this doesn't solve the problem. The original poster only wants to find sentences that contain 300 characters or more. Sentences containing fewer characters should not be found.

In theory, search criteria, like the following, seem to be the way to go:
"Any Character" with Repetition: 'This much or more' and Min. 300.

But…, unfortunately that doesn't work either.
I missed the 300-character part, but modifying the expression as you described worked just fine for me.
If you take the expression I linked to, double-click the "any character" element and set it to "this much or more" and the value to 300, it will find only paragraphs that are longer than 300 characters.
Vaissiere
Read the guide!
Posts: 39
Joined: Sun Dec 26, 2010 10:46 am
Location: Paris

Re: Search for long phrases

Post by Vaissiere »

Thanks a lot for your answers. Yes, after many mistakes, I thought of the "Negated" feature and it worked. I am happy to see that I was not the only one to solve the problem.

Yours

Etienne