New Mars Forums

Official discussion forum of The Mars Society and MarsNews.com

Announcement

Announcement: As a reader of the NewMars forum, you have opportunities to assist with technical discussions in several initiatives underway. NewMars needs volunteers with appropriate education, skills, talent, motivation and generosity of spirit to join as highly valued members. Write to newmarsmember * gmail.com to tell us about your ability to help contribute to NewMars and become a registered member.

#2451 2022-08-23 14:44:39

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Email Outreach campaign ....

A curiosity of the FluxBB email subsystem is that it provides no feedback on whether it worked or not.

I stopped in at FluxBB.org to see if anyone had ever discussed the mail subsystem, and I found one topic.

That topic included a link to a github repository, and there I found this:

$mail_to      = $recipientname." <".$recipientemail.">";
$mail_subject = pun_htmlspecialchars($_POST['message_subject']);
$mail_message = pun_htmlspecialchars($_POST['message_body']);

pun_mail($mail_to, $mail_subject, $mail_message);

From this I deduce that we could see the pun_mail code if we re-connect to the forum source code.

My guess is that the procedure .... I stopped because I have ** no ** idea what it's doing .... if it launches the resident email program, that could be anything.

If it has its own email program, that would have to work within the operating system.

Interesting!

(th)

Offline

#2452 2022-08-24 05:53:15

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan ...

Completed Sequence for ID: 83000


Total Command Lines found: 26
Total input Lines in script: 183

  Number of ID's processed: 1000

Starting Number: 82001

Last Number of Run: 83000

Summary for Web Automation Report for 08-24-2022 at 07:49:00
Average time of Loop from Main form: 00:00:20
2 Skip Exceptions were recorded.

Total time of Processing: 11:28:17

Total time Program was Active: 11:31:48

Potholes: 82143 82018
Edits: Zero

This evening's scan will try for 84000

(th)

Offline

#2453 2022-08-24 18:13:54

SpaceNut
Administrator
From: New Hampshire
Registered: 2004-07-22
Posts: 28,750

Re: Housekeeping

I have not received any other emails via the forum's emailer as of today, though I did receive the previous one. As you indicated, the timeout might have caused the send not to be performed.

Offline

#2454 2022-08-24 18:44:41

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re 2453

Thanks for your report on NOT receiving confirmation emails.

I am ** definitely ** getting the impression the email service is unreliable.  In the days ahead, my plan is to set up a run as follows:

1) My test account
2) The candidate
3) SpaceNut
4) My test account

It is possible that on a given day, only the first email goes out.

If that is the case, then I'll change the sequence so the candidate is first, and the confirmation emails won't matter.

I ** know ** that multiple emails can work, because RobertDyck complained about receiving one.

I'll have to go back to see if his username was first in the list on the day it was sent.

As another experiment, I could stack 10 transmissions to my test account.

It just crossed my mind that I could add the time of day of the transmission, in order to find out which messages got sent, assuming ** any ** get sent!
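
The time-of-day idea could look something like the sketch below. It only builds the subject lines; it does not send anything, and the function name, the subject text and the recipient labels are all made up for illustration.

```python
from datetime import datetime

def stamped_subjects(base_subject, recipients, now=None):
    """Return (recipient, subject) pairs. Each subject carries its
    position in the batch and the time of day it was queued."""
    now = now or datetime.now()
    stamp = now.strftime("%H:%M:%S")
    total = len(recipients)
    return [
        (name, f"{base_subject} [{i} of {total}, queued {stamp}]")
        for i, name in enumerate(recipients, start=1)
    ]

# The four-step sequence planned above; the labels are placeholders.
batch = ["test account", "candidate", "SpaceNut", "test account"]
for name, subject in stamped_subjects("NewMars outreach", batch):
    print(name, "->", subject)
```

With the position and time in each subject, any message that does arrive shows where it sat in the batch, so a gap in the sequence points at the transmissions that were dropped.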

(th)

Offline

#2455 2022-08-25 06:47:10

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Potholes report...

Three potholes were reported:
83340 83213 83203

Tomorrow's run will try for 85000 - goal is 199396 so should finish in 2022

***
Good to see Mars_B4_Moon back after a break!

(th)

Offline

#2456 2022-08-26 06:08:55

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Potholes report...

Completed Sequence for ID: 85000


Total Command Lines found: 26
Total input Lines in script: 183

  Number of ID's processed: 1000

Starting Number: 84001

Last Number of Run: 85000

Summary for Web Automation Report for 08-26-2022 at 07:53:23
Average time of Loop from Main form: 00:00:20
6 Skip Exceptions were recorded.

Total time of Processing: 12:41:53

Total time Program was Active: 12:45:15

Potholes found: (6) 84352 84207 84157 84107 84106 84052

Tomorrow's Pothole scan will try for 86000 - today's high water mark is 199510

Update at 10:26 local time ... both confirmation emails arrived this morning. You should have received one.

(th)

Offline

#2457 2022-08-27 05:39:37

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan...

WBA reached 86000 ...

40 skips were recorded ... analysis to follow
SK:  85084 Bad request
SK:  85103 Bad request
SK:  85153 Bad request
SK:  85202 Bad request
SK:  85203 Bad request
SK:  85240 Bad request
SK:  85247 Bad request
SK:  85248 Bad request
SK:  85249 Bad request
SK:  85251 Bad request
SK:  85252 Bad request
SK:  85254 Bad request
SK:  85257 Bad request
SK:  85259 Bad request
SK:  85260 Bad request
SK:  85261 Bad request
SK:  85264 Bad request
SK:  85265 Bad request
SK:  85272 Bad request
SK:  85274 Bad request
SK:  85289 Bad request
SK:  85298 Bad request
SK:  85300 Bad request
SK:  85379 Bad request
SK:  85380 Bad request
SK:  85406 Bad request
SK:  85411 Bad request
SK:  85412 Bad request
SK:  85413 Bad request
SK:  85414 Bad request
SK:  85465 Bad request
SK:  85466 Bad request
SK:  85467 Bad request
SK:  85469 Bad request
SK:  85470 Bad request
SK:  85483 Bad request
SK:  85502 Bad request
SK:  85527 Bad request
SK:  85532 Bad request
SK:  85566 Bad request

Seems like a lot ...

Tentative plans for today...
1) Test script to convert zero post ID's to TestID's
2) Send one email
3) Advance code for grid capture
4) Pothole scan to 87000

(th)

Offline

#2458 2022-08-27 10:35:27

SpaceNut
Administrator
From: New Hampshire
Registered: 2004-07-22
Posts: 28,750

Re: Housekeeping

It looks like we are closing in on the first fail attack area...

Offline

#2459 2022-08-27 10:42:40

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re #2458

Thanks for your historical perspective!

*** FYI ... It's been a ** long ** time since I ran the TestID script.  I found an old copy and tried it on the set of usernames with Zero Posts that you provided.

It took quite a while to remember how to use the script, which requires an adjustment to the setup screen I had forgotten.

In addition, the script had a tab in a location where it is no longer needed.  I have no idea what happened, but removing that tab allowed the script to run properly.

We should have 31 more TestID's in the banned list when the script finishes today.

Our Outreach emails seem to have gone out.

(th)

Offline

#2460 2022-08-27 12:29:50

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re users with Zero posts

Please double check ... I just ran a search for users with zero posts, and all I found are TestID's

It could be that the list you gave me to convert was the last of them, and I converted that group.

However, there may be others that I can't see for some reason.

Just FYI ... I don't expect to need the TestID script again (if there are no users with zero posts)

However, just in case, I updated the script to perform a mouse click on the Bans page. This will avoid the tab.

(th)

Offline

#2461 2022-08-28 07:35:57

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan ....

Reminder: Please double check ... I ** think ** we have converted all Zero Post usernames, but would appreciate your confirming those usernames are now gone.

Completed Sequence for ID: 87000


Total Command Lines found: 26
Total input Lines in script: 183

  Number of ID's processed: 1000

Starting Number: 86001

Last Number of Run: 87000

Summary for Web Automation Report for 08-28-2022 at 09:21:24
Average time of Loop from Main form: 00:00:20
90 Skip Exceptions were recorded.

Total time of Processing: 13:20:09

Total time Program was Active: 13:29:40

Skip analysis for 90 Skips:
SK: Skipping  86015
SK: Skipping  86020
SK: Skipping  86028
SK: Skipping  86029
SK: Skipping  86034
SK: Skipping  86054
SK: Skipping  86055
SK: Skipping  86056
SK: Skipping  86058
SK: Skipping  86066
SK: Skipping  86080
SK: Skipping  86083
SK: Skipping  86084
SK: Skipping  86088
SK: Skipping  86091
SK: Skipping  86092
SK: Skipping  86126
SK: Skipping  86128
SK: Skipping  86132
SK: Skipping  86136
SK: Skipping  86138
SK: Skipping  86139
SK: Skipping  86143
SK: Skipping  86145
SK: Skipping  86149
SK: Skipping  86153
SK: Skipping  86164
SK: Skipping  86170
SK: Skipping  86171
SK: Skipping  86172
SK: Skipping  86185
SK: Skipping  86186
SK: Skipping  86191
SK: Skipping  86192
SK: Skipping  86196
SK: Skipping  86205
SK: Skipping  86212
SK: Skipping  86221
SK: Skipping  86223
SK: Skipping  86234
SK: Skipping  86240
SK: Skipping  86261
SK: Skipping  86267
SK: Skipping  86268
SK: Skipping  86308
SK: Skipping  86309
SK: Skipping  86318
SK: Skipping  86338
SK: Skipping  86342
SK: Skipping  86343
SK: Skipping  86346
SK: Skipping  86347
SK: Skipping  86349
SK: Skipping  86351
SK: Skipping  86352
SK: Skipping  86354
SK: Skipping  86355
SK: Skipping  86356
SK: Skipping  86357
SK: Skipping  86358
SK: Skipping  86360
SK: Skipping  86437
SK: Skipping  86446
SK: Skipping  86451
SK: Skipping  86458
SK: Skipping  86463
SK: Skipping  86467
SK: Skipping  86469
SK: Skipping  86470
SK: Skipping  86471
SK: Skipping  86472
SK: Skipping  86473
SK: Skipping  86474
SK: Skipping  86477
SK: Skipping  86478
SK: Skipping  86480
SK: Skipping  86481
SK: Skipping  86482
SK: Skipping  86483
SK: Skipping  86484
SK: Skipping  86485
SK: Skipping  86487
SK: Skipping  86496
SK: Skipping  86497
SK: Skipping  86498
SK: Skipping  86523
SK: Skipping  86528
SK: Skipping  86538
SK: Skipping  86540
SK: Skipping  86563

Boy! I hope things start returning to "normal" soon!

***
I ** really ** appreciate Mars_B4_Moon bringing those old topics back to life, and thus showing the benefits of your months-long campaign to make them readable.

I'll launch the email outreach shortly.

Tonight's Pothole scan will try for 88000.

(th)

Offline

#2462 2022-08-29 06:36:13

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan ...

We did a bit better in the latest run ... there were only 14 potholes

14 Skip Exceptions were recorded.

SK: Skipping ID: 87072
SK: Skipping ID: 87128
SK: Skipping ID: 87206
SK: Skipping ID: 87260
SK: Skipping ID: 87296
SK: Skipping ID: 87325
SK: Skipping ID: 87334
SK: Skipping ID: 87352
SK: Skipping ID: 87371
SK: Skipping ID: 87372
SK: Skipping ID: 87373
SK: Skipping ID: 87380
SK: Skipping ID: 87383
SK: Skipping ID: 87384

Potholes can occur "naturally" if a member deletes a post.
In addition, an administrator can "move" a post from one topic to another.
A "Move" involves creating a new copy in another topic, and deleting the original.
However, these events ** should ** be rare, so I would expect a "normal" report to be zero potholes.
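
One way to look at a skip report is to group the IDs into consecutive runs, since an isolated ID looks more like a single deleted or moved post than a block of neighbors does. The sketch below is not part of the WBA script; it assumes each skip line starts with "SK:" and ends with the post ID, optionally followed by text such as "Bad request".

```python
import re

def skipped_ids(report_text):
    """Pull the post ID out of every line that starts with 'SK:'."""
    ids = set()
    for line in report_text.splitlines():
        if not line.strip().startswith("SK:"):
            continue
        match = re.search(r"(\d+)\D*$", line)  # last number on the line
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)

def consecutive_runs(ids):
    """Collapse sorted IDs into (first, last) runs of adjacent numbers."""
    runs = []
    for n in ids:
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n
        else:
            runs.append([n, n])
    return [tuple(run) for run in runs]

sample = """SK: Skipping ID: 87371
SK: Skipping ID: 87372
SK: Skipping ID: 87373
SK: Skipping ID: 87380"""
print(consecutive_runs(skipped_ids(sample)))  # [(87371, 87373), (87380, 87380)]
```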

Tonight's run will try for 89000

We have a candidate lined up for today's email outreach.

(th)

Offline

#2463 2022-08-30 06:18:50

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Potholes Scan ....

59 was the count this time ...

Completed Sequence for ID: 89000


Total Command Lines found: 26
Total input Lines in script: 183

  Number of ID's processed: 1000

Starting Number: 88001

Last Number of Run: 89000

Summary for Web Automation Report for 08-30-2022 at 07:50:19
Average time of Loop from Main form: 00:00:20
59 Skip Exceptions were recorded.

Total time of Processing: 12:13:39

Total time Program was Active: 12:16:17

Skip analysis for 59 potholes:
    Post    ID:    88035
    Post    ID:    88036
    Post    ID:    88037
    Post    ID:    88046
    Post    ID:    88048
    Post    ID:    88050
    Post    ID:    88077
    Post    ID:    88085
    Post    ID:    88086
    Post    ID:    88087
    Post    ID:    88100
    Post    ID:    88101
    Post    ID:    88127
    Post    ID:    88128
    Post    ID:    88129
    Post    ID:    88147
    Post    ID:    88148
    Post    ID:    88149
    Post    ID:    88183
    Post    ID:    88184
    Post    ID:    88185
    Post    ID:    88198
    Post    ID:    88200
    Post    ID:    88202
    Post    ID:    88206
    Post    ID:    88230
    Post    ID:    88231
    Post    ID:    88232
    Post    ID:    88233
    Post    ID:    88234
    Post    ID:    88238
    Post    ID:    88269
    Post    ID:    88270
    Post    ID:    88283
    Post    ID:    88326
    Post    ID:    88328
    Post    ID:    88329
    Post    ID:    88339
    Post    ID:    88345
    Post    ID:    88346
    Post    ID:    88347
    Post    ID:    88378
    Post    ID:    88379
    Post    ID:    88384
    Post    ID:    88388
    Post    ID:    88430
    Post    ID:    88436
    Post    ID:    88465
    Post    ID:    88466
    Post    ID:    88488
    Post    ID:    88489
    Post    ID:    88528
    Post    ID:    88529
    Post    ID:    88530
    Post    ID:    88536
    Post    ID:    88537
    Post    ID:    88538
    Post    ID:    88544
    Post    ID:    88545

Tonight's run will try for 90,000 ... today's post ID is 199700 - 110 days estimated to finish

Per Google: Today's date is 29-Aug-2022 (UTC). Today's Julian Date is 22241 .

241+110 >> 351 ... Projected finish is in mid-December
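
The projection above can be checked with a few lines of Python. This is only a sketch: it assumes the scan resumes from 90000, holds 1000 IDs per day, and that the target stays at 199700 (new posts will keep moving it).

```python
from datetime import date, timedelta

def projected_finish(today, current_id, target_id, ids_per_day):
    """Return (days_needed, finish_date) at a steady daily rate."""
    remaining = target_id - current_id
    days = -(-remaining // ids_per_day)  # ceiling division
    return days, today + timedelta(days=days)

days, finish = projected_finish(date(2022, 8, 29), 90000, 199700, 1000)
print(days, finish, finish.timetuple().tm_yday)  # 110 2022-12-17 351
```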

*** Inside baseball section...

During the final days of the Post Repair initiative, I began to explore the possibility of "reading" the browser display like a human, by "looking" at pixels.

The first stage of exploration went fairly smoothly.  A test program is able to capture pixel properties from a section of screen as specified by the operator.

The resulting file shows the pixel location and its properties for all the pixels requested.

The ** next ** stage of exploration is taking a bit longer ....

It is possible for a spreadsheet program to show a magnified version of the captured grid, but getting from here to there is a bit more involved than I realized.

Today I finished coding and testing the first of four phases of processing needed to deliver an XML file that can be imported to a spreadsheet.

The steps are:
1) Open header XML, show it to operator, and copy to output if approved <done>
2) Open model XML for rows, show it to operator, and copy it to memory for the next step
3) Open capture data, show it to the operator, and deliver XML for each cell if approved
4) Open trailer XML, show it to operator, and copy it to output if approved.
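
The four steps above can be sketched in Python as follows. This is not the actual program: the element names, the "{cells}" placeholder in the row model, the (x, y, value) tuple layout and the approve callback (standing in for the operator's yes/no) are all assumptions made for illustration.

```python
from itertools import groupby
from xml.sax.saxutils import escape

def build_grid_xml(header, row_model, pixels, trailer,
                   approve=lambda label, text: True):
    """Assemble header + one row per scan line + trailer.

    row_model is a hypothetical template such as "<Row>{cells}</Row>".
    pixels is a list of (x, y, value) tuples from the capture step.
    """
    out = []
    if approve("header", header):                        # step 1
        out.append(header)
    if not approve("row model", row_model):              # step 2
        raise ValueError("row model rejected by operator")
    ordered = sorted(pixels, key=lambda p: (p[1], p[0]))
    if approve("capture data", f"{len(ordered)} pixels"):  # step 3
        for _, row in groupby(ordered, key=lambda p: p[1]):
            cells = "".join(
                f"<Cell>{escape(str(value))}</Cell>" for _, _, value in row
            )
            out.append(row_model.format(cells=cells))
    if approve("trailer", trailer):                      # step 4
        out.append(trailer)
    return "\n".join(out)

print(build_grid_xml("<Table>", "<Row>{cells}</Row>",
                     [(1, 0, "b"), (0, 0, "a"), (0, 1, "c")], "</Table>"))
```

Keeping the row model as a template means the spreadsheet-specific markup stays in the XML files, and the code only fills in cell values.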

At the rate I'm going, finish is about a week out.  Fortunately, there is no hurry, because the email and pothole scripts are running fine without it.

However, ** you ** may come up with something you need for which this would be useful, so I'm planning to finish it.

(th)

Offline

#2464 2022-08-30 18:49:06

SpaceNut
Administrator
From: New Hampshire
Registered: 2004-07-22
Posts: 28,750

Re: Housekeeping

By looking at the next post count, it appears that we are now in 2006....

Offline

#2465 2022-08-31 05:41:37

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan ...

We show only 14 potholes in last night's run to 90000.  That is an improvement, for sure!

The average time-per-post is 20 seconds.  Tonight I'll try a run to 92000.  It will take just under 12 hours, so the laptop will be busy all night, if all goes well.
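
For reference, the run-time estimate is simple arithmetic, shown here as a small Python sketch. It assumes the 20 second average holds for the whole run, which may not be the case on a long session.

```python
from datetime import timedelta

def run_duration(posts, seconds_per_post=20):
    """Estimated wall-clock time for a scan at a steady per-post rate."""
    return timedelta(seconds=posts * seconds_per_post)

print(run_duration(2000))  # 11:06:40
```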

***
Looking ahead to December ... on the 28th (more or less) Mars will complete another orbit.  We've been observing the daily movements of Mars since 2019/01/20, and we observed completion of Year 35 on February 9, 2021.  Mars is in its final quarter, and we'll be observing completion of Year 36.
It seems to me the Proposed Business Calendar for Mars has held up well, over the past several Earth years. At some point, I am hoping this proposed calendar will be considered for adoption by the Mars Society, so that it can be offered for sale as a fund raising method for the non-profit goals of the Society.  It is late but not TOO late to begin planning for printing of a calendar to cover Year 37.  That calendar would simultaneously cover two Earth years: 2023 and 2024.
The end of Mars Year 37 would occur near November 14, 2024.  There's a great deal of Mars artwork/photograph images that could be added to a Mars Calendar offering by the Mars Society.  The calendar could be offered in both print and digital form.
***
In the Simple OCR study, my hope is to complete two more of the five phases of the XML writer feature.
Update at 12:07 local time. Modest progress ... one step in proceeding to phase two of the file is opening the input again.
I decided to deal with this issue by expanding the versatility of the simple program to allow for successive input file opens.

For Email Outreach, I'll review at least one candidate, and perhaps more, depending upon how it goes.
Update at 12:07 ... I found a candidate and sent an email. You should be receiving your copy.
This candidate is from 2003, so it is ** highly ** unlikely we will hear back.

Thanks for helping with the fire discussion.  The best preventive measure would ** seem ** to be avoiding combustible materials as much as possible.  I note that the NASA high school project reported on by the North Houston chapter of NSS specifically includes fire analysis of materials in the scrubbing process.

The North Houston chapter recently featured a speaker who's been helping with the high school outreach campaign, and he reported on a successful development of a folding table that was eventually put into service on the ISS.

If you're interested in seeing the talk, it is saved as a YouTube video, via a link from northhoustonspace.org

Skip analysis for 14 skips:
SK:  Post ID: 89059
SK:  Post ID: 89114
SK:  Post ID: 89216
SK:  Post ID: 89238
SK:  Post ID: 89240
SK:  Post ID: 89241
SK:  Post ID: 89301
SK:  Post ID: 89330
SK:  Post ID: 89362
SK:  Post ID: 89400
SK:  Post ID: 89415
SK:  Post ID: 89504
SK:  Post ID: 89533
SK:  Post ID: 89534

Update at 20:22 local time .... a Pothole Scan over 2000 posts is under way ... it should finish before 8 AM tomorrow.

The process slows down as data accumulates in the browser history, so I'm not sure if the performance will remain at 20 seconds per post.

(th)

Offline

#2466 2022-09-01 06:38:54

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan ...

The run that finished this morning was set for 2000 posts....

It reached the finish line with a Green Screen and ** only ** 20 potholes...

However, it caused Google to notice... I've never seen this message before, but 12 hours was long enough to cause it:

About this page

Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?

IP address: (this IP address)
Time: 2022-09-01T11:11:12Z
URL: https://www.google.com/search?q=92000&r … e&ie=UTF-8

What's curious about this is that it makes reference to 92000, which was the closing point for the run.

I'll set tonight's run for 2000 again, and it will be interesting to see if the robot detection algorithm shows up again.

The run that just ended gave these results:

Completed Sequence for ID: 92000


Total Command Lines found: 26
Total input Lines in script: 183

  Number of ID's processed: 2000

Starting Number: 90001

Last Number of Run: 92000

Summary for Web Automation Report for 09-01-2022 at 08:19:17
Average time of Loop from Main form: 00:00:20
20 Skip Exceptions were recorded.

Total time of Processing: 12:00:32

Total time Program was Active: 12:04:08

  Lines containing String One 34 << SpaceNut posts
  Lines containing String Two 2,000 posts in run
Lines containing String Three 60 << potholes / 3 >> 20
Lines containing String Four 40  << potholes / 2 >> 20

SK: ID: 90006
SK: ID: 90007
SK: ID: 90031
SK: ID: 90117
SK: ID: 90147
SK: ID: 90168
SK: ID: 90174
SK: ID: 90182
SK: ID: 90222
SK: ID: 90280
SK: ID: 90297
SK: ID: 90338
SK: ID: 90340
SK: ID: 90374
SK: ID: 90412
SK: ID: 90424
SK: ID: 90441
SK: ID: 90454
SK: ID: 90463
SK: ID: 90464

In review of the report of potholes above, I note that the last detection was 90464, and the next 1536 were problem free.  A few days back you recalled a rough patch during transition to FluxBB, so I'm wondering if tonight's run might be clean as a whistle. 

Work will continue today on Simple OCR ... 2 of 5 phases of XML write are finished. Phase 3 is analysis of the Row Model, and it should go fairly smoothly.  Phase 4 is analysis and processing of the captured pixel data. Phase 5 is appending of the trailer XML and it should go quickly. 

I still think it will take several days to reach the near term goal, given the (small) amount of time allocated each day.

Update at 15:17 local time ... For the Simple OCR project, today was two steps back (with recovery) and no steps forward.

The procedure involves opening four files in sequence, and there were some loose ends (ie, loops) in the first two phases.  Hopefully those have been cleaned up so work can resume on input of the pixel data tomorrow.

(th)

Offline

#2467 2022-09-02 06:00:11

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For  SpaceNut

Google noticed the 12 hour run again:

About this page

Our systems have detected unusual traffic from your computer network. This page checks to see if it's really you sending the requests, and not a robot. Why did this happen?

IP address: (this one)
Time: 2022-09-02T11:22:59Z
URL: https://www.google.com/search?q=94000&r … e&ie=UTF-8

This time I decided to try the reCAPTCHA option next to "Is this really you" ... It called a combine a tractor, which indicates to me that some employee of reCAPTCHA miscategorized the combine.  That would have caused confusion for a lot of humans, as it did for me, but the reCAPTCHA algorithm was satisfied when I clicked on the combine.

The run itself seemed to go well...

Completed Sequence for ID: 94000


Total Command Lines found: 26
Total input Lines in script: 183

  Number of ID's processed: 2000

Starting Number: 92001

Last Number of Run: 94000

Summary for Web Automation Report for 09-02-2022 at 08:04:26
Average time of Loop from Main form: 00:00:20
21 Skip Exceptions were recorded.

Total time of Processing: 11:33:41

Total time Program was Active: 11:36:28

Only 21 potholes for 2000 posts seems like a reasonable number, considering what we've seen recently.

Still, I would expect a count of zero.

Investigating further ... it appears that reCAPTCHA started well before the end.

The first occurrence of the word is at 93149.

However, the program was misbehaving before then.

Working backward ... the program was off the rails at 92974

It looks as though the last "good" post processing occurred at 92433

A series of potholes showed up around 92439

A "good" post was processed at 92449

OK ... now I'm thinking it will be necessary to look more carefully at the logs of the Pothole Scans to see if they have gone off the rails in a similar manner.

A new (to me for sure) failure mode has shown up ... The post does not generate a "bad request" but no content is delivered ...

OK ... enough analysis for now .... I'll suspend pothole processing until I understand the problem.

Update at 19:35 local time .... preliminary analysis seems to reveal a problem with how the script handles the transition from submitting a request to edit a post, and processing of the delivered data.

In a nutshell, I suspect a simple tab is used to move from the address bar to the edit window, and if that tab does not take effect due to Internet congestion, then the subsequent processing is not well grounded.

Before running any more scans, I'll study the problem more carefully, and try to find a more reliable solution.

Please note that both BeerMan and GW Johnson report having received acceptance messages from the Conference committee.

(th)

Offline

#2468 2022-09-03 06:27:44

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Suspension of Scans....

The excursion into Google's reCAPTCHA territory reported recently has revealed new failure modes.

I suspect they were always present as possibilities, but the slow pace of the previous runs, and the 300 post limit per day, may have prevented putting stress on the system that caused them to show up.  Going to 2000 posts in 12 hours ** definitely ** stressed the system to the breaking point.

The first confirmed defect that I found in log analysis yesterday is a failure of the simple TAB that moves from the Address Bar to the display page of the browser.

Under stress, the log shows clearly that the tab may fail to perform its needed action, so that subsequent operations are performed on the Address Bar instead of the display page.
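
A general way to guard a step like this is to wait, act, and then verify before moving on, retrying a few times if the check fails. The Python sketch below shows the pattern only; send_tab and focus_is_on_page are hypothetical stand-ins for whatever the automation tool provides, and the retry count and wait time are guesses.

```python
import time

def tab_with_check(send_tab, focus_is_on_page, retries=3, wait_s=1.0,
                   sleep=time.sleep):
    """Send the tab, then confirm focus moved before continuing.

    Returns the attempt number that succeeded. Raises RuntimeError if
    focus never reaches the page, so the caller can skip the post
    instead of processing whatever is in the Address Bar.
    """
    for attempt in range(1, retries + 1):
        sleep(wait_s)          # give the page time to settle
        send_tab()
        if focus_is_on_page():
            return attempt
    raise RuntimeError(f"focus did not reach the page after {retries} tries")
```

The important part is the explicit failure: a post that cannot be verified is reported as such rather than being counted as a pothole.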

The new Simple OCR that's been in development recently might be able to deal with that, or at least detect the event.

In any case, I've decided to stop the pothole scans for now.

The main benefit of continuing the pothole scans was to exercise the MySQL database in a systematic way.  I think that is still worth doing, but it is not a priority.

On the ** other ** hand, the new Email Outreach campaign will (or promises to) go on for years, at the rate of only one person per day.

***

Did you notice that Mars_B4_Moon and friends put the daily active report over the two full page mark?

When I looked for something yesterday evening, I found it in the middle of the third page of results.

It was ** really ** neat to find noosfractal (from 2007) back in view thanks to Mars_B4_Moon updating the Enceladus topic!

(th)

Offline

#2469 2022-09-03 12:10:13

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scans ...

I've decided to gather more information before attempting to solve whatever the problem may be that showed up recently, when the FluxBB system was put into a stress test by attempting 2000 posts in a 12 hour session.

I ** did ** add a one second wait before the Tab command that I discovered had failed on at least one occasion.

To gather information, I've added several report commands to the Potholes script.

I'll restart at 78001 and keep the runs to 1000.

The challenge I am facing is how to determine "normal" behavior of the script, when "normal" is characterized by random text written by NewMars members decades ago, and Departure from Normal could be other random characters generated by whatever Internet misbehavior may occur.

An Artificial Intelligence would be able to distinguish actual posted text from the variety of alternatives that show up when things go off the rails.

Unfortunately, this script is a long l - o - n - g way from any kind of intelligence.

(th)

Offline

#2470 2022-09-03 13:25:27

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Simple OCR mini-project ...

In study of the behavior of the WBA program, interacting with a browser which is itself interacting with FluxBB, I have concluded that "Edit Post" is a set of pixels that show up consistently when things are going well, and which does NOT show up when any of a myriad of errors occur.

I took a screen capture of the Edit Post section of the view, and then cropped it to 83x20 pixels.

The total number of pixels is given (by GIMP) as 1600.   That is a manageable number of pixels to scan.

83x20 is 1660 so GIMP may have rounded down to the nearest whole number.

The actual number needed may be far fewer, because mismatches will occur quickly if the image of Edit Post is compared to the random pixels that show up when any of a myriad of errors occur.
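
The early-mismatch idea can be sketched like this in Python. It assumes both images have already been flattened to lists of (r, g, b) tuples in the same scan order; the function name and the tolerance parameter are illustrative only.

```python
def matches_reference(reference, captured, tolerance=0):
    """Compare two equal-length pixel lists in scan order.

    Stops at the first pixel whose channels differ by more than
    tolerance. Returns (matched, pixels_checked).
    """
    if len(reference) != len(captured):
        return False, 0
    for checked, (ref, cap) in enumerate(zip(reference, captured), start=1):
        if any(abs(r - c) > tolerance for r, c in zip(ref, cap)):
            return False, checked
    return True, len(reference)

reference = [(255, 255, 255)] * (83 * 20)
wrong_page = [(0, 0, 0)] + reference[1:]
print(matches_reference(reference, reference))   # (True, 1660)
print(matches_reference(reference, wrong_page))  # (False, 1)
```

A full match costs all 1660 comparisons, but an error page usually fails within the first few pixels.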

I'll try to store the png on imgur.com

bgMoquW.png

OK ... so ** that's ** the focus of the current development cycle.


(th)

Offline

#2471 2022-09-03 18:21:39

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Scan rerun ...

I reran the scan from 78001 to 79000 ... 2 skips were recorded and they are likely to be potholes.

However, due to recent failures encountered during long runs of this script, I'm planning to review it for other faults that may have occurred.

Completed Sequence for ID: 79000


Total Command Lines found: 31
Total input Lines in script: 195

  Number of ID's processed: 1000

Starting Number: 78001

Last Number of Run: 79000

Summary for Web Automation Report for 09-03-2022 at 20:13:32
Average time of Loop from Main form: 00:00:21
2 Skip Exceptions were recorded.

Total time of Processing: 05:59:09

Three posts were reported with problems - I'll look at them shortly

Match for: Edit completed for: newmars.com/forums/edit Count deleted: 1 Count is: 67
String 2: Completed Sequence for ID: 78511

Match for: Edit completed for: newmars.com/forums/edit Count deleted: 1 Count is: 82
String 2: Completed Sequence for ID: 78625

Match for: Edit completed for: newmars.com/forums/edit Count deleted: 1 Count is: 83
String 2: Completed Sequence for ID: 78626

All three posts came up fine when tested manually.

The sequence from 78001-79000 has no potholes or other artifacts.

I'll set up a run over the series from 79001-80000 using the new script.

As a side note ... there were 80 SpaceNut posts reported but (of course) those are not reported here.

(th)

Offline

#2472 2022-09-03 22:31:16

SpaceNut
Administrator
From: New Hampshire
Registered: 2004-07-22
Posts: 28,750

Re: Housekeeping

Sounds like denial-of-service detection code set off the activity alert.

I agree with running the sequential data locations even if all it shows are potholes.

Yes, activity for Newmars is at an all-time high, and I wish that others would not look at it as old topic churn but as making topics relevant, and not just making topics for the sake of seeing the key word in the title....

Sometimes the posts are not quite on topic, but I would rather have people reading the links and then seeing discussion rather than no posts or discussion.

Offline

#2473 2022-09-04 07:01:02

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut ....

Your comments about topic titles are interesting.

Since you are the Senior Administrator, you have an influence over the flow of activity.

***
The change in approach to running the scans is producing interesting results ...

Last night's run covered 79000-80000

There were 112 posts reported

Of those, 55 were SpaceNut posts

That left 57 which appear to be good posts by regular members

The most logical/reasonable explanation is that Internet congestion is interfering with the Tab that is supposed to move the focus from the Address Bar to the Edit window.

There may not be any potholes at all.

The Simple OCR concept I've been working on may be able to solve the problem, by looking for [Edit Post] in the pixels, before proceeding to process (ie, look at) a post.

Overnight, I realized the presentation density on a device is going to influence the usefulness of the method.

I'm ** pretty ** sure the screens in use for lookup, development and automation all use 1024x768.

However, I need to verify that before putting the new Simple OCR into service.

In the meantime (since the development will take several more days) I'm thinking of making a script to check the ID's that are flagged by the Potholes script.

57 ID's is too many to check by hand.

Update at 10:05 ....

I ran a special Pothole script to check IDs from a file. The run showed that all but 1 of the 57 were good member posts.
There was one SpaceNut post missed in the longer run.
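
For anyone following along, here is a minimal Python sketch of the "check IDs from a file" idea. It is not the WBA script itself. The names parse_ids, split_by_author and get_author are made up for illustration, and the IDs in the usage lines are placeholders. The sketch assumes one ID per line in the file, and that some lookup (supplied by the caller) can report the author of a post.

```python
def parse_ids(text):
    """Return post IDs from text holding one ID per line.

    Blank lines and lines that are not purely numeric are skipped.
    """
    ids = []
    for line in text.splitlines():
        token = line.strip()
        if token.isdigit():
            ids.append(int(token))
    return ids


def split_by_author(ids, get_author, admin_name="SpaceNut"):
    """Split IDs into (admin_posts, member_posts).

    get_author is a hypothetical callable that maps a post ID to an author name.
    """
    admin_posts, member_posts = [], []
    for post_id in ids:
        if get_author(post_id) == admin_name:
            admin_posts.append(post_id)
        else:
            member_posts.append(post_id)
    return admin_posts, member_posts


# Usage with placeholder data (not real forum IDs or authors).
sample_authors = {79001: "SpaceNut", 79002: "member_a", 79003: "member_b"}
flagged = parse_ids("79001\n79002\n\n79003\n")
admin_posts, member_posts = split_by_author(flagged, sample_authors.get)
```

The lookup is passed in as a function so the same loop works whether the author comes from a screen read, a saved report, or a test table.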

I added a mouse click before the tab, to try to prevent future incidents involving the tab.

Update at 13:20 local time ...

In a recent post I reported on the challenge/opportunity of dealing with differing computer monitor display densities...

Because my initial capture of the bit map of [Edit Post] was taken on a higher density monitor than 1024x768, I decided to capture the same image on the laptop, which is set to 1024x768.  To my surprise the resulting capture was almost exactly the same size... 63 pixels long and 13 pixels high.

It is possible that the browser accepts a request from FluxBB and adapts it to the local situation.

In any case, it would appear that a pattern match to confirm [Edit Post] is present on the screen would involve on the order of 800 pixels (63 x 13 = 819).

It is even possible a smaller number could be checked to have a strong indication the desired pixels are present.
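
A rough Python sketch of checking a smaller number of pixels follows. It assumes the screen and the captured [Edit Post] graphic are both available as lists of rows of pixel values, which is an assumption about the data layout, not a description of the Simple OCR program. The name sampled_match and the stride parameter are made up for illustration.

```python
def sampled_match(screen, template, top, left, stride=4):
    """Compare only every stride-th template pixel, counted in row-major order.

    A True result is a strong hint, not proof, because skipped pixels are never examined.
    """
    width = len(template[0])
    total = len(template) * width
    for index in range(0, total, stride):
        dy, dx = divmod(index, width)
        if screen[top + dy][left + dx] != template[dy][dx]:
            return False
    return True
```

With a 63 by 13 graphic and a stride of 4, this looks at about a quarter of the pixels. The trade-off is that a difference in a skipped pixel goes unnoticed, so a full comparison is still the safer final check.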

However, there ** is ** another wrinkle ... in watching the WBA program performing Pothole checks today, I noticed that the location of [Edit Post] varies vertically, depending upon the number of words in the title of the index level topic for the post in question.

That means that the program intending to "look" for [Edit Post] pixels is going to need to look in more than one place.
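
One way to "look in more than one place" is to hold the column fixed and try each candidate row in turn. The Python sketch below assumes the graphic moves only vertically and that pixels are stored as lists of rows. Both are assumptions, and the function names are hypothetical.

```python
def matches_at(screen, template, top, left):
    """Return True if every template pixel equals the screen pixel at this offset."""
    for dy, row in enumerate(template):
        for dx, pixel in enumerate(row):
            if screen[top + dy][left + dx] != pixel:
                return False
    return True


def find_template_row(screen, template, left, top_min, top_max):
    """Try rows top_min..top_max (inclusive) at a fixed column.

    Returns the first row where the template matches exactly, or None.
    """
    height = len(template)
    width = len(template[0])
    if left < 0 or left + width > len(screen[0]):
        return None  # template would fall off the side of the screen
    last_top = min(top_max, len(screen) - height)
    for top in range(max(top_min, 0), last_top + 1):
        if matches_at(screen, template, top, left):
            return top
    return None


# Tiny stand-in data: a 3x2 "graphic" placed at row 5, column 4 of a blank 10x12 screen.
template = [[1, 0, 1], [0, 1, 0]]
screen = [[0] * 10 for _ in range(12)]
for dy, row in enumerate(template):
    for dx, pixel in enumerate(row):
        screen[5 + dy][4 + dx] = pixel

found_row = find_template_row(screen, template, left=4, top_min=0, top_max=9)  # 5
```

Limiting the search to a band of rows keeps the work small: each extra title line only adds a few more rows to try.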

However, ** that's ** a challenge for some future day.

The status of the Simple OCR inquiry is holding at two phases complete out of five needed.

(th)

Offline

#2474 2022-09-05 07:52:17

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Re-Scan of Potholes ...

Last night's run reached 81000

Two skips were reported but analysis is needed before we know what those were

Completed Sequence for ID: 81000


Total Command Lines found: 34
Total input Lines in script: 205

  Number of ID's processed: 1000

Starting Number: 80001

Last Number of Run: 81000

Summary for Web Automation Report for 09-05-2022 at 09:47:31
Average time of Loop from Main form: 00:00:22
2 Skip Exceptions were recorded.

Total time of Processing: 11:18:33

With changes made to try to understand the problems that showed up recently, multiple runs of a tool to extract any actual problems are needed. 

Analysis:
SpaceNut posts: 85
Real potholes: 2

Query Text Requested Was Found: Bad request
SK: Skipping to Next Item from Post ID: 80102 Bad request
Query Text Requested Was Found: Bad request
SK: Skipping to Next Item from Post ID: 80437 Bad request

However, those reports may have been due to stress of the run...
The two posts cited are manually confirmed. They are real potholes.
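
Pulling the skipped IDs out of a report by hand gets tedious, so here is a small Python sketch that extracts them from the "SK:" lines quoted above. It assumes every skip line has exactly that wording. The name extract_skips is made up, and this is not part of the WBA program.

```python
import re

# Matches lines such as: "SK: Skipping to Next Item from Post ID: 80102 Bad request"
SKIP_LINE = re.compile(r"^SK: Skipping to Next Item from Post ID:\s*(\d+)\s*(.*)$")


def extract_skips(report_text):
    """Return a list of (post_id, reason) pairs, one per SK line in the report."""
    skips = []
    for line in report_text.splitlines():
        match = SKIP_LINE.match(line.strip())
        if match:
            skips.append((int(match.group(1)), match.group(2).strip()))
    return skips


report = """Query Text Requested Was Found: Bad request
SK: Skipping to Next Item from Post ID: 80102 Bad request
Query Text Requested Was Found: Bad request
SK: Skipping to Next Item from Post ID: 80437 Bad request
"""
skipped = extract_skips(report)  # [(80102, 'Bad request'), (80437, 'Bad request')]
```

The resulting list could be written out one ID per line, ready for a follow-up run that re-checks only those posts.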

Tonight's Pothole scan will try for 82000

Update at 13:44 local time

This is just for the record .... work on the Simple OCR mini-project advanced a bit today.

There are five phases to generation of an XML file suitable for input to a spreadsheet, to show pixels captured from a live screen.

Today the input (and operator confirmation) steps were advanced through phase five.  All that remains is the (relatively straightforward) step of appending the trailer XML to the output file.

Left undone at this point are the details needed to actually match up screen capture data with the XML row model, to generate valid XML rows of pixel data.

Meanwhile, the [Edit Post] graphic was captured on the computer where it needs to be captured at run time.  I was (a bit) surprised to find that the pixels generated on the target machine (with 1024x768 resolution) were almost identical to those on a machine whose monitor is running at a much higher resolution.

I deduce from this observation that the browser must be working behind the scenes to render the FluxBB screen display so it looks the same on different machines. Carrying this thought further .... there must be "virtual" pixels, because the hardware underlying the display can have a number of physical pixels that differ from the virtual ones.
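
To make the "virtual" pixel idea concrete: browsers lay pages out in CSS pixels and map them to hardware pixels by a scale factor, often called the device pixel ratio. The Python sketch below is only arithmetic under that model. The function name is made up, and the ratios shown are examples, not measurements from either machine.

```python
def device_size(css_width, css_height, pixel_ratio):
    """Convert a size in CSS ("virtual") pixels to approximate hardware pixels."""
    return round(css_width * pixel_ratio), round(css_height * pixel_ratio)


# The [Edit Post] graphic measured 63 x 13 in the captures described above.
for ratio in (1.0, 1.25, 2.0):
    print(ratio, device_size(63, 13, ratio))
```

If both machines run at a ratio of 1.0, the graphic covers the same 63 x 13 pixels on each, no matter how many pixels the whole monitor has. That would be consistent with the nearly identical captures, though the actual settings on the two machines have not been checked.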

(th)

Offline

#2475 2022-09-06 06:37:39

tahanson43206
Moderator
Registered: 2018-04-27
Posts: 16,756

Re: Housekeeping

For SpaceNut re Pothole Rescans ...

WBA reached 82000 in last night's run ...

Completed Sequence for ID: 82000


Total Command Lines found: 34
Total input Lines in script: 206

  Number of ID's processed: 1000

Starting Number: 81001

Last Number of Run: 82000

Summary for Web Automation Report for 09-06-2022 at 08:37:44
Average time of Loop from Main form: 00:00:22
9 Skip Exceptions were recorded.

Total time of Processing: 15:12:03

The report of 9 skips implies 9 potholes, but (at this point) this is just a suggestion.

Detailed analysis is needed to establish what ** really ** happened.

Update at 10:48 .... It turns out that I've encountered a new failure mode, or rather rediscovered an old one.

In both recent runs, the laptop and FluxBB got out of sync ...

The run from 80000 failed at 80634

The run from 81000 failed at 80683

The runs from 79000 and 80000 appear to have completed without encountering this error.

An idea I'm considering is adjusting the script to halt if it finds the Address Bar contents where the page should be.

I ** think ** there's an existing script command to handle that.
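
Whatever the script command turns out to be, the check itself might look something like this Python sketch. It assumes the automation can hand over the text it captured from what should be the Edit window, and that Address Bar contents can be recognized as a single URL with no spaces. All names here are hypothetical, and the URL in the usage lines is a placeholder.

```python
class OutOfSyncError(RuntimeError):
    """Raised when the captured text looks like Address Bar contents."""


def looks_like_url(text):
    """Return True if the text is a single token starting with http:// or https://."""
    stripped = text.strip()
    return stripped.startswith(("http://", "https://")) and len(stripped.split()) == 1


def check_in_sync(captured_text, post_id):
    """Halt the run (by raising) if the capture appears to be the Address Bar."""
    if looks_like_url(captured_text):
        raise OutOfSyncError(
            f"Address Bar contents captured at post ID {post_id}; halting"
        )


# Usage with placeholder values.
check_in_sync("For SpaceNut re Pothole Rescans ...", 80001)  # passes quietly
try:
    check_in_sync("https://example.org/viewtopic.php?pid=80002", 80002)
except OutOfSyncError as error:
    halted_message = str(error)
```

Halting at the first bad capture means the restart point is known exactly, instead of being discovered later from the report.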

In the meantime, I'll restart the scans from the respective failure points.

Starting from 80634 now ...

(th)

Offline
