Call for help fixing ATX support.



Hello, everyone.

 

The ATX support on FujiNet is broken as a result of the port to the latest ESP32 vendor toolkit that we had to do; it seems that angular position timing no longer works correctly.

 

I have been poring over this problem for the last few weeks and am no closer to a viable solution (all the usual low-hanging fruit has already been tried, such as increasing the timer resolution for the angular position variable), and could really use an extra pair of eyes, even from neighboring projects (such as SDrive-MAX).

The code is completely contained within here:
https://github.com/FujiNetWIFI/fujinet-platformio/blob/master/lib/media/atari/diskTypeAtx.cpp

 

Thanks,
-Thom


a viable solution would be to settle on the best tool chain possible and stop chasing the latest shiny new one unless it fixes something major and has been working in the general public for some time. You have had almost every part of fujinet working at some point and then toolchain this or that breaks this or that, so everything is never fully working all at once. This is problematic not only for you but for the thousands of FN owners such as myself.

Moving targets are hard to hit, and not everyone is a marksman.

Edited by _The Doctor__

11 minutes ago, _The Doctor__ said:

a viable solution would be to settle on the best tool chain possible and stop chasing the latest shiny new one unless it fixes something major and has been working in the general public for some time. You have had almost every part of fujinet working at some point and then toolchain this or that breaks this or that, so everything is never fully working all at once. This is problematic not only for you but for the thousands of FN owners such as myself.

Moving targets are hard to hit, and not everyone is a marksman.

The problem is, we can't do that. Locking to a specific version and staying there is _NOT_ a viable solution.

 

There have been situations where e.g. compilers disappear from repositories when older toolchains start to fall away, to say nothing of the documentation. So we are forced to keep things running on the latest toolchains.

 

I am making the best of a situation I can't reliably control.

 

-Thom

 

(And I am sorry, I need to go on a tear here: I get this statement from people who have _NEVER_ had to not only maintain, but extend, a large code-base over a long period of time. It is exceedingly unrealistic in today's world, where every bit of documentation and toolchain software isn't on a disk or a slice of dead tree somewhere. I did not create this reality, but I have to live in it!)

 

Edited by tschak909

2 hours ago, _The Doctor__ said:

a viable solution would be to settle on the best tool chain possible and stop chasing the latest shiny new one unless it fixes something major and has been working in the general public for some time. You have had almost every part of fujinet working at some point and then toolchain this or that breaks this or that, so everything is never fully working all at once. This is problematic not only for you but for the thousands of FN owners such as myself.

Moving targets are hard to hit, and not everyone is a marksman.

To me that sounds like "Why does anyone switch to Windows 10, when XP is still running stable and fulfills all MY needs, but why is this weird developer of Altirra no longer supporting it?" 


Not the same thing at all, altirra continues to work and fujinet doesn't. If you know that this is going to happen, stick with the thing that works until the new one is verified and working. No one wants to help because all of their efforts go for naught, at least that's the word on the street so to speak and I can see why. If the toolchain breaks the device and it no longer works properly for the Atari that's not the same as it won't work on the wintel OS of my choice. It's whack a mole and it doesn't need to be. Like one can't keep the tools on their hard drive or keep notes on dead trees. This becomes farcical at best.

Edited by _The Doctor__

11 minutes ago, _The Doctor__ said:

Not the same thing at all, altirra continues to work and fujinet doesn't. If you know that this is going to happen, stick with the thing that works until the new one is verified and working. No one wants to help because all of their efforts go for naught, at least that's the word on the street so to speak and I can see why. If the toolchain breaks the device and it no longer works properly for the Atari that's not the same as it won't work on the wintel OS of my choice. It's whack a mole and it doesn't need to be. Like one can't keep the tools on their hard drive or keep notes on dead trees. This becomes farcical at best.

It's not like we didn't try! We spent two years at the same toolkit version, until it literally fell off the repository. It's kind of a pain in the ass, when you have a new developer that comes on board, and they can't build the same environment because e.g. the compiler is now not available. And the straw that broke the camel's back, was that we needed things in the newer IDF that simply weren't available in the one we were currently using.

 

-Thom


@tschak909

I was told the buffer is 128 bytes, but it only ever reports 127 when full. If it reports zero when empty, that's off by one, since zero actually is zero in this case. If the buffer is an actual buffer, it doesn't always contain an EOL at the 128th byte, does it?

Curious, because these answers are hard to come by. BASIC likes 119 as its standard unless you extend it. So with that in mind, you need to switch from record mode to byte mode when the buffer is more than full, otherwise you will get a truncation error. If you simply add one to the buffer count when it really is 127 bytes instead of 128, you have to wait for the FujiNet to time out when using the N: device with BASIC, or so it seems?

It will hold the Atari as if it were locked up for up to 30 or until another byte arrives in the buffer. Hopefully some light can be shed on this.

 

Edited by _The Doctor__

3 hours ago, _The Doctor__ said:

Not the same thing at all, altirra continues to work and fujinet doesn't. If you know that this is going to happen, stick with the thing that works until the new one is verified and working. No one wants to help because all of their efforts go for naught, at least that's the word on the street so to speak and I can see why. If the toolchain breaks the device and it no longer works properly for the Atari that's not the same as it won't work on the wintel OS of my choice. It's whack a mole and it doesn't need to be. Like one can't keep the tools on their hard drive or keep notes on dead trees. This becomes farcical at best.

That's not as good of an example as you might think. Altirra has been upgraded from Visual Studio 2005 all the way to Visual Studio 2022 in C++20 mode, and these days I upgrade both the toolchain and the C++ standard pretty aggressively. It also has been broken before by bugs in the toolchain. But that's far better than staying on an old toolchain for too long. Also, I don't have to deal with a real-time embedded environment and can use sanitizer tools that are impractical to run on-device.

 


Just now, phaeron said:

That's not as good of an example as you might think. Altirra has been upgraded from Visual Studio 2005 all the way to Visual Studio 2022 in C++20 mode, and these days I upgrade both the toolchain and the C++ standard pretty aggressively. It also has been broken before by bugs in the toolchain. But that's far better than staying on an old toolchain for too long. Also, I don't have to deal with a real-time embedded environment and can use sanitizer tools that are impractical to run on-device.

 

[Link: "burn meme" entry on Dictionary.com]


9 hours ago, tschak909 said:

The ATX support on FujiNet is broken as a result of the port to the latest ESP32 vendor toolkit that we had to do; it seems that angular position timing no longer works correctly.

 

I have been poring over this problem for the last few weeks and am no closer to a viable solution (all the usual low-hanging fruit has already been tried, such as increasing the timer resolution for the angular position variable), and could really use an extra pair of eyes, even from neighboring projects (such as SDrive-MAX).

The code is completely contained within here:
https://github.com/FujiNetWIFI/fujinet-platformio/blob/master/lib/media/atari/diskTypeAtx.cpp

Do you have any specifics on how it's broken? The first test I would do is to run any of the standard disk RPM tests and see if at least the disk routines respond with 288 RPM timing -- if it doesn't, you have a very stable baseline for how it should be responding to check the timer code.
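As a baseline for that check, the arithmetic behind the 288 RPM figure can be sketched as below. This is a minimal Python illustration, not code from FujiNet or Altirra; the function names are made up for the example, and only the 288 RPM figure comes from the post above.

```python
# Sketch: timing baseline for a nominal 288 RPM drive.
# Function names are hypothetical; only 288 RPM comes from the thread.

NOMINAL_RPM = 288
US_PER_MINUTE = 60_000_000


def rotation_period_us(rpm: float = NOMINAL_RPM) -> float:
    """Microseconds taken by one full disk rotation at the given speed."""
    return US_PER_MINUTE / rpm


def measured_rpm(elapsed_us: float, rotations: int) -> float:
    """RPM implied by observing `rotations` full turns in `elapsed_us`."""
    return rotations * US_PER_MINUTE / elapsed_us


def rpm_error_percent(observed_rpm: float, nominal: float = NOMINAL_RPM) -> float:
    """Signed deviation from the nominal speed, in percent."""
    return (observed_rpm - nominal) / nominal * 100.0
```

One rotation at 288 RPM takes roughly 208.3 ms, so an RPM tester that reports a clearly different speed would point at the timer base rather than at the ATX logic itself.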

 


Just now, phaeron said:

Do you have any specifics on how it's broken? The first test I would do is to run any of the standard disk RPM tests and see if at least the disk routines respond with 288 RPM timing -- if it doesn't, you have a very stable baseline for how it should be responding to check the timer code.

 

Pretty much anything that relies on accurate sector timing seems to take too long, but it's marginal. 

I have been using DjayBee's ATX test suite (CheckProtection), and it fails on the copy protections that rely on accurate sector skewing (e.g. Broderbund, Synapse).
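To make the sensitivity to sector skew concrete, here is a small Python model of angular position arithmetic. It is a sketch under stated assumptions, not the firmware's code: it assumes angular units of 8 µs and about 26042 units per rotation (208333 µs / 8 µs, rounded), and the function names are hypothetical. The real constants should be checked against diskTypeAtx.cpp.

```python
# Sketch: angular position and wait-for-sector arithmetic.
# ASSUMPTIONS (not taken from the thread): 8 us angular units and
# 26042 units per rotation. Function names are hypothetical.

AU_MICROSECONDS = 8       # assumed length of one angular unit
AU_PER_ROTATION = 26042   # assumed units in one 288 RPM rotation


def angular_position(now_us: int) -> int:
    """Angular unit under the head at time `now_us` for a disk that
    never stops spinning."""
    return (now_us // AU_MICROSECONDS) % AU_PER_ROTATION


def wait_for_sector_au(head_au: int, sector_au: int) -> int:
    """Angular units until `sector_au` next passes under the head.
    The modulo makes a sector that was just missed cost almost a
    full rotation."""
    return (sector_au - head_au) % AU_PER_ROTATION


def wait_for_sector_us(head_au: int, sector_au: int) -> int:
    """Same wait, expressed in microseconds."""
    return wait_for_sector_au(head_au, sector_au) * AU_MICROSECONDS
```

With these assumed constants, arriving 50 units early costs 400 µs, while arriving a single unit late (`wait_for_sector_us(101, 100)`) costs 208328 µs. That is one possible way a small unaccounted delay could show up as a large timing difference; it is not a diagnosis of the actual bug.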

 

And yes, VERY GOOD IDEA actually. I will load an RPM tester! :)

 

-Thom


21 minutes ago, tschak909 said:

Pretty much anything that relies on accurate sector timing seems to take too long, but it's marginal. 

I have been using DjayBee's ATX test suite (CheckProtection), and it fails on the copy protections that rely on accurate sector skewing (e.g. Broderbund, Synapse).

 

And yes, VERY GOOD IDEA actually. I will load an RPM tester! :)

 

-Thom

OK, so I grabbed an RPM testing program, plopped it onto a disk with an autorun, and converted it to an ATX disk so that it would be sure to use the ATX media type (and all the accurate timing bits that go along with it). It seems to be mostly stable.

 

 

Hmmm...

 

-Thom


The next step I would try is just booting a plain DOS 2 disk with standard interleave and comparing the timings. If it's grossly off you may even be able to hear the difference. The next step beyond that would probably be to check the logs from the failed test runs and determine whether the problem is the emulation returning the wrong data or the wrong timings. If it's wrong data then it should be pretty easy to track down what sectors it should have returned and why it didn't.

 


I didn't bring Altirra into it, but it supported XP for a hell of a long time, for which many were grateful. Please don't make it sound like I am currently complaining about it. To your credit, you appear more than capable of keeping up with the current update practice which you now employ. As noted, that is something you do currently, but not necessarily as aggressively in the past.

 

I didn't use that as an example; someone else did, in any event. I did, however, point out that Altirra works. But to clarify, and to your credit, when small things were broken, you listened and fixed them straight away. We don't get the whole runaround and rundown each time something is asked, mentioned, or pointed out. You simply provide answers or solutions, or let us know it's going to take some work to fix whatever it is. Many times you provide a way to do it correctly or explain how to use a workaround.

 

Notice I didn't get an answer to my question about the buffer? Even with the cited observation as an example? We appear more worried about how things are perceived as opposed to how it works.

5 hours ago, _The Doctor__ said:

@tschak909

I was told the buffer is 128 bytes, but it only ever reports 127 when full. If it reports zero when empty, that's off by one, since zero actually is zero in this case. If the buffer is an actual buffer, it doesn't always contain an EOL at the 128th byte, does it?

Curious, because these answers are hard to come by. BASIC likes 119 as its standard unless you extend it. So with that in mind, you need to switch from record mode to byte mode when the buffer is more than full, otherwise you will get a truncation error. If you simply add one to the buffer count when it really is 127 bytes instead of 128, you have to wait for the FujiNet to time out when using the N: device with BASIC, or so it seems?

It will hold the Atari as if it were locked up for up to 30 or until another byte arrives in the buffer. Hopefully some light can be shed on this.

 

It would be nice if we had some road map to know what is broken in which firmware, so it would be possible to pick the best one for our needs. I am using 1.0 now. It's mostly usable. I don't think flashing every version to the FujiNet and trying them all is viable. It isn't good to keep flashing it over and over, and it's impractical.

Edited by _The Doctor__

1 hour ago, _The Doctor__ said:

I didn't bring Altirra into it, but it supported XP for a hell of a long time, for which many were grateful. Please don't make it sound like I am currently complaining about it. To your credit, you appear more than capable of keeping up with the current update practice which you now employ. As noted, that is something you do currently, but not necessarily as aggressively in the past.

 

I didn't use that as an example; someone else did, in any event.

This is true, it was me who brought it up.

 

And for a reason:

If you need a new toolchain for technical reasons (like compilers are no longer running), you must migrate to it to enable further development, even if you would like to stay with the old one and even if it breaks compatibility with certain OSs/platforms/whatever.

 

This happened to phaeron developing Altirra and the same happened to the team of developers working on FujiNet.

But you are welcome to cry over the old days for a while longer.

 

Oh, and similarly to Altirra: Old versions of the firmware are still working on new devices. Unless you need one of the later added features, there is no need to update it.

Edited by DjayBee

since we just keep pushing and don't seem to get it... since 2019...

I was happy that the basic modem emulation was fixed, but then the other stuff appears broken. It still doesn't answer what the most functional version of the firmware is, and it still doesn't answer the buffer question.

You see, I bought this car and I can't drive it. I would like to, but when I tried, the air-fuel mixture was off and the engine shut off on longer drives. I took it in and they fixed the air-fuel mixture, but they somehow messed up the sensor code that reads the tire rotation for ABS, and the thing keeps applying the brakes. So I take it back in and the updates are done, and now the engine runs, but the brakes, instead of coming on all by themselves periodically, now don't want to engage at all. So I bring it in again and they fix all that, but on long trips the computer fills up and trashes all of its settings, requiring you to either get a programmer and program it yourself or take it in to the shop again to do it. So now it's all good, except it keeps forgetting its user settings, making the seats move all the way up, the mirrors all point in the wrong directions, the steering wheel tilt down and out, and the pedals move toward the seat, and you can't get in the car. Okay, we fix that, but now the thing can't always read the key's security chip properly and it simply stays in valet mode, good for 15 minutes. These aren't aesthetic issues, or an issue like the dome light not coming on because the body control module is borked. You seem to want to waste time with these sorts of defenses and comparisons. I just want a working driving device. The assurances come, but the answers to basic questions never seem to.

 

I have been able to use the modem again, the .cas function was working to load tapes again and can be used with a real datasette with a workaround, and it no longer trashes the TNFS list or tosses the WiFi config out, for which I gave credit and which is one of the reasons I had hope and purchased a 1.6 (immediately superseded by 1.7 before the thing arrived at my door). But it still can't function long term, and it may or may not have buffer issues, but could that be a symptom of something else? Trying to work with this and fix or repair existing programs, so they are more than a curiosity or neat demo, is not possible if the specifications are not easily discerned and you can't get answers. If you do ask and give observations, you get memes and sighs etc., and lots of defenders of the crown appear. It still doesn't help. I do appreciate Mozzwald in that he can sometimes get answers and pass them along. I would much rather just have some answers. In order to fix a thing you need a list of everything that was wrong with previous versions, but where is that list? I mean, the GitHub issues hardly cover it, and they are not in a simple sheet to look at. The specifications need to be gleaned from videos and a possible spartan example... simple questions go unacknowledged or unanswered. I don't want this to go into a drawer, or to have to give another one away, since I will inevitably get asked if I gave someone a broken thing. No, it's not a broken hardware thing, it's a firmware thing; they suggest you update... rinse, repeat.

Edited by _The Doctor__

14 minutes ago, _The Doctor__ said:

since we just keep pushing and don't seem to get it... since 2019...

I was happy that the basic modem emulation was fixed, but then the other stuff appears broken. It still doesn't answer what the most functional version of the firmware is, and it still doesn't answer the buffer question.

You see, I bought this car and I can't drive it. I would like to, but when I tried, the air-fuel mixture was off and the engine shut off on longer drives. I took it in and they fixed the air-fuel mixture, but they somehow messed up the sensor code that reads the tire rotation for ABS, and the thing keeps applying the brakes. So I take it back in and the updates are done, and now the engine runs, but the brakes, instead of coming on all by themselves periodically, now don't want to engage at all. So I bring it in again and they fix all that, but on long trips the computer fills up and trashes all of its settings, requiring you to either get a programmer and program it yourself or take it in to the shop again to do it. So now it's all good, except it keeps forgetting its user settings, making the seats move all the way up, the mirrors all point in the wrong directions, the steering wheel tilt down and out, and the pedals move toward the seat, and you can't get in the car. Okay, we fix that, but now the thing can't always read the key's security chip properly and it simply stays in valet mode, good for 15 minutes. These aren't aesthetic issues, or an issue like the dome light not coming on because the body control module is borked. You seem to want to waste time with these sorts of defenses and comparisons. I just want a working driving device. The assurances come, but the answers to basic questions never seem to.

 

I have been able to use the modem again, the .cas function was working to load tapes again and can be used with a real datasette with a workaround, and it no longer trashes the TNFS list or tosses the WiFi config out, for which I gave credit and which is one of the reasons I had hope and purchased a 1.6 (immediately superseded by 1.7 before the thing arrived at my door). But it still can't function long term, and it may or may not have buffer issues, but could that be a symptom of something else?

 

 

All I can ask is that you file issues for things that aren't working, and we will fix them as we can.

 

This is an open project, people come in, add things, sometimes things break, we try to fix, and the one thing I always make time for, is to dump my knowledge into the heads of people who want to work on FujiNet and Fujinet things.

 

I will say that this should have probably been pulled into another thread.

 

-Thom

 

Edited by tschak909

When you find the time, is the buffer issue I mentioned a bug or is there something else at play?

If I ask you directly or ask mozzwald or share an observation with either, I consider that informing on the issue. If we get an answer or a solution/explanation, that's cool. Maybe another thread would have been in order, but it's run its course.

 

further observation

I am being told that sometimes remnants are transmitted on output, so is there a way from BASIC to clear the buffer without flushing it to the network? Since you mention memory leaks in this thread, it might be related, if data were leaking to places it shouldn't or the buffer simply is not being zeroed when recycled. Normally this happens at the end and beginning of back-to-back record mode outputs.

 

Edited by _The Doctor__

(since you seem hell-bent on talking about it here)

 

No, this isn't related.

The CIO (N:) handler reads data in 127 byte chunks (this was so I could check whether the buffer was full with a BPL)

https://github.com/FujiNetWIFI/fujinet-nhandler/blob/master/handler/src/ndev.s#L193

 

For write, two conditions cause a flush of the buffer out to SIO:

(1) a new line ($9B) is sent

(2) an XIO Command 15 is sent to flush the data out SIO: https://github.com/FujiNetWIFI/fujinet-nhandler/blob/master/handler/src/ndev.s#L477

 

The only way to clear the buffer is to close.
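The read and write rules described above can be modeled in a few lines of Python. This is an illustrative sketch only, not a translation of ndev.s: the class and method names are hypothetical, and it assumes that the EOL byte itself is part of what gets flushed. What close does with pending bytes is deliberately left out, since the post only says that close clears the buffer.

```python
# Sketch of the N: handler buffering rules described in the post.
# A Python model for illustration, not the handler's 6502 code;
# the class and method names are hypothetical.

EOL = 0x9B            # ATASCII end of line, the "new line" in the post
XIO_FLUSH = 15        # XIO command number named in the post
READ_CHUNK_MAX = 127  # largest read chunk; keeps bit 7 of the count clear


def bpl_would_branch(value: int) -> bool:
    """BPL branches when the 6502 N flag is clear, i.e. when bit 7 of
    the value just loaded is 0. Any count up to 127 qualifies."""
    return (value & 0x80) == 0


class WriteBufferModel:
    """Collects outgoing bytes and hands them to `send` on a flush."""

    def __init__(self, send):
        self._send = send            # stand-in for the SIO write
        self._pending = bytearray()

    def put(self, byte: int) -> None:
        self._pending.append(byte)
        if byte == EOL:              # condition (1): end of line sent
            self.flush()             # assumption: EOL is flushed too

    def xio(self, command: int) -> None:
        if command == XIO_FLUSH:     # condition (2): explicit flush
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._send(bytes(self._pending))
            self._pending.clear()
```

In this model, bytes written without a trailing $9B stay pending until an XIO 15 arrives, which matches the two flush conditions listed above. `bpl_would_branch(READ_CHUNK_MAX)` is true while `bpl_would_branch(128)` is not, which is why a 127-byte limit lets a single sign test stand in for a comparison; how the handler actually uses that test is in the linked source.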

 

-Thom


5 hours ago, tschak909 said:

@phaeron we don't turn our "motor" off, we're always spinning. Would this be an issue?

-Thom

I doubt it, protections typically do all their checks with the motor continuously running and some drives have long motor idle timeouts (>6 seconds). Highly doubt any accuracy issues would be related to this. FWIW, Altirra doesn't emulate spin-up/spin-down, it simply instantaneously pops the motor between completely stopped and full speed.

 

(Edit: Might be a good idea to check for overflows in your arithmetic, especially if the tick base might have changed in the toolchain or you're using signed arithmetic. Altirra uses unsigned 32-bit cycles for tracking rotations, so it has a known stable clock rate with controlled overflow behavior.)
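The overflow point can be illustrated with a short Python model of 32-bit counters. Python integers do not wrap, so the masking is explicit here; in C or C++ the same effect comes from using `uint32_t`. The function names and the tick rates in the usage note are hypothetical, and nothing here is taken from the FujiNet or Altirra sources.

```python
# Sketch: wrap-safe elapsed time with a 32-bit counter, modeled in
# Python with explicit masking. Function names are hypothetical.

MASK32 = 0xFFFFFFFF


def elapsed_u32(now: int, then: int) -> int:
    """Elapsed ticks between two readings of a free-running unsigned
    32-bit counter. Correct across one wrap of the counter."""
    return (now - then) & MASK32


def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits as a signed value, to show what a
    signed variable would hold for the same bit pattern."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def ticks_to_us(ticks: int, tick_hz: int) -> int:
    """Convert ticks to microseconds for a given tick rate. If the
    toolchain changed the tick rate, code that still assumes the old
    rate gets durations scaled by the ratio of the two."""
    return ticks * 1_000_000 // tick_hz
```

For example, `elapsed_u32(5, 0xFFFFFFFB)` is 10 even though the counter wrapped in between, while the same reading stored in a signed variable (`as_int32(0xFFFFFFFB)`) is negative. A 32-bit microsecond counter wraps after about 71.6 minutes, and a signed one goes negative after about 35.8 minutes, so this kind of bug may only appear on long runs. The tick-base case is similar: `ticks_to_us(80_000_000, 80_000_000)` is one second, but the same tick count read against an assumed 40 MHz base comes out as two.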

Edited by phaeron

Just now, phaeron said:

I doubt it, protections typically do all their checks with the motor continuously running and some drives have long motor idle timeouts (>6 seconds). Highly doubt any accuracy issues would be related to this. FWIW, Altirra doesn't emulate spin-up/spin-down, it simply instantaneously pops the motor between completely stopped and full speed.

yup, all signs are pointing to some unaccounted for delays (e.g. memory transfer from PSRAM), this is gonna be fun to solve... :)

 

I can hear some of you saying, "Why not use git bisect?" This only works if you have the older toolkits in your cache. Our project is old enough that the ESP32 PLATFORMIO bindings have gone through SEVERAL deprecation cycles.

 

-Thom


14 hours ago, tschak909 said:

I can hear some of you saying, "Why not use git bisect?" This only works if you have the older toolkits in your cache. Our project is old enough that the ESP32 PLATFORMIO bindings have gone through SEVERAL deprecation cycles.

 

'git bisect' is no longer viable, but you do have an archive of all the firmware revisions, yes?  You could do a manual bisect on the firmware images to find the revision where the regression occurred, which may narrow down the code or toolset change responsible.
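A manual bisect over archived images is just a binary search with a human in the loop. The Python sketch below shows the bookkeeping; it assumes the images are listed oldest to newest, that the first one is known good and the last one known bad, and that there is a single point where the regression appeared. The function name and the image names in the usage note are hypothetical, and `is_good` stands for whatever check is used (for example flashing the image and running the failing protection test).

```python
# Sketch: manual bisect over an ordered list of firmware images.
# Assumes images[0] is known good, images[-1] is known bad, and that
# every image after the first bad one is also bad.
from typing import Callable, Sequence


def first_bad_image(images: Sequence[str],
                    is_good: Callable[[str], bool]) -> str:
    """Return the earliest image for which `is_good` is False."""
    if len(images) < 2:
        raise ValueError("need at least one good and one bad image")
    lo, hi = 0, len(images) - 1      # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(images[mid]):
            lo = mid                 # regression is after mid
        else:
            hi = mid                 # regression is at or before mid
    return images[hi]
```

With eight images named `fw-01` to `fw-08` and a regression first present in `fw-06`, this asks about three images before answering, so each extra doubling of the archive costs only one more flash-and-test cycle.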


16 minutes ago, FifthPlayer said:

 

'git bisect' is no longer viable, but you do have an archive of all the firmware revisions, yes?  You could do a manual bisect on the firmware images to find the revision where the regression occurred, which may narrow down the code or toolset change responsible.

yup indeed. This is another good plan of attack. Thank you.

-Thom

 
