Do me a favor and listen to the "Chittering" in some UA sounds and compare them to some I transformed (and here). Is it imagination, or is there something in that chittering?
Just on an airport run, but I will get on it when I get back. I'm convinced there is at least one life form (possibly more) in the UAs, especially after seeing an image just now; part of the UA looks suspiciously like an egg sack.
Don't know if this is helpful, but there's a minor chord occurring sometimes in sequences 4, 5 and 6. It's a bit of an oddity, because most tones go up.
It's the straight horizontal lines between 300 and 600 Hz, easily visible here:
Especially 4 and 5 look like they have almost the same effects and distortions on them.
Hope it's not ingame music or anything.
Oh and I feel there's also more slow "zombie" chatter in those 3 than usual.
I will fiddle around some more tomorrow ;-)
It is known that the howls are sometimes laid over the top of the chittering, and, I think, that the chittering sound continues throughout(?), but we use this phraseology for ease of reference.
When I say "chittering" I mean the lack of howls or purrs between the last howl (with or without associated purrs) and the next purr.
When I say "howl" I mean that loud "whale noise", which may or may not contain a purr (or two). "Howl" is much shorter to type than "whale noise".
The problem is that decoding a Morse code message (especially one as long as this message seems to be) is effectively impossible without some indication of at least letter breaks; without them the symbol stream is massively ambiguous.
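To make that concrete, here's a rough sketch (only a handful of Morse codes included, purely for illustration; the function and the example string are mine, not from the thread) that counts how many ways an unbroken dot/dash run could be split into letters:

Code:
MORSE = {".": "E", "-": "T", ".-": "A", "-.": "N",
         "..": "I", "--": "M", "...": "S", "---": "O"}

def parse_count(signal):
    # counts[i] = number of ways to split signal[:i] into letters
    counts = [1] + [0] * len(signal)
    for i in range(1, len(signal) + 1):
        for length in (1, 2, 3):          # the codes above are 1-3 symbols long
            if i >= length and signal[i - length:i] in MORSE:
                counts[i] += counts[i - length]
    return counts[-1]

print(parse_count("...---..."))           # "SOS" is only one of many readings

Even with this tiny alphabet the count climbs fast, which is the point: without letter breaks there is no single decoding to recover.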
How about an egg that is crying for its parents? It looks a hell of a lot like an egg. The chitters and purrs could be the creatures inside, while the howl would be the "mom, where are you?" cry.
Has anyone confirmed whether the sound depends on the UA's health percentage? I'm not sure how long it takes for the sound to change pitch, but if the UA is at a salvageable percentage it may be a good idea to dump it, record the sound until the pitch changes, pick it up, then dump it again. If the sound depends on the health percentage it will stay at the higher pitch; if it resets, we'll know the pitch change is unrelated to health.
Edit: I want to find one of these. Are there people out there actively looking for them or are all the efforts aimed at analyzing the existing audio material?
I know little about sound engineering but I have very good ears (and speakers/headphones) and am very good at rational thinking.
I've been thinking along similar lines; we could be looking for more than is there. The sound definitely sounds like something alive, and maybe that hints that there are creatures in there.
The (apparent) structure is more convincing than the variety; in fact, the signal could show more variety and give us less evidence that something is hidden in there. The fact that no three consecutive purrs share a tone is kind of a big deal. If there were no such structure, if the tones were convincingly random, I would have much less suspicion that something is hiding in the message, even though there would be far more variety.
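For what it's worth, a quick simulation shows how unlikely that structure is by pure chance. This assumes sequences of roughly 75 two-valued tones (my own rough guess at the transcription lengths), so treat the exact number loosely:

Code:
import random

# Rough estimate: how often would a purely random two-tone sequence of
# length ~75 avoid ever playing the same tone three times in a row?
def has_triple(seq):
    return any(seq[i] == seq[i + 1] == seq[i + 2] for i in range(len(seq) - 2))

trials = 100_000
clean = sum(not has_triple([random.randint(0, 1) for _ in range(75)])
            for _ in range(trials))
print(f"{clean}/{trials} random sequences had no triple")   # in practice ~0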
Took me a while to find it, but the background sound (the "buzzing", as I like to call it) reminds me of the TOS episode "Wink of an Eye", where the aliens disappeared because they were sped way up and outside the visible spectrum. Someone said in the episode there was even a hidden message...
Hey guys, quick update on my weekend work. I spent much of my time doing a fairly lengthy transcription of one of the long audio recordings. It's preliminary, but I think I may have a significantly different structure for the high/low bit notes; once I free up later today I'll re-analyze it and report back.
With regard to trying to extract some form of message from these sounds, there are a few fairly basic alternatives:
1. There is a static message - possibly encrypted.
2. There is a changing message - a countdown, or a variation based on location.
3. There is no message - and it is effectively random noise.
For option 1 we would expect to see large sections of message repeated, even if it is in some way encrypted.
For option 2 we would probably still expect to see large chunks of message repeated.
For option 3 we would expect to find much less in the way of repeated sections.
So the first step should be to see if there is a repeating sequence. Most importantly, we need to be able to see this sequence in more than two samples.
I've been trying to parse the transcripts as posted using some simple scripts that treat the data as a bitstream and look for common sequences between pairs of transcriptions, and there really isn't a repeating sequence there.
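For anyone who wants to repeat the check, this is roughly the kind of comparison I mean; the two transcription strings below are made up purely for illustration:

Code:
def longest_common_run(a, b):
    """Longest substring the two bit strings share (simple O(n*m) scan)."""
    best, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best, best_end = curr[j], i
        prev = curr
    return a[best_end - best:best_end]

t1 = "0110100110110100101101001"   # made-up stand-in for one transcription
t2 = "1101001011011010011010011"   # made-up stand-in for another
match = longest_common_run(t1, t2)
print(len(match), match)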
I used 23 transcribed sequences, typically 70-80 'bits' long; the longest matches I found, including unknown/missing data, were these (the first two numbers reference my dataset so that I can hand-check things):
Notice that in both these cases I'm not seeing the same sequence match in more than two transcriptions. I could do some further work to quantify how many of the shorter matches appear in multiple transcriptions, but, to be honest, given the results from checking for matches between two transcriptions I don't think it's worth the effort...
So, based on the data seen so far I would say there isn't any message hidden in the way that people are currently looking at things.
We appear to be seeing some form of generated sequence; it follows rules in timing and sequence of sounds (e.g. not repeating the same sound three times in a row).
I think that a lot of what is being discussed is a case of coming up with a theory and then hunting through the data in an attempt to find some segment that fits it. Whilst this is a very normal thing for humans to do when solving problems, and it works very well for the types of problem where we understand the parameters (it's my approach to basic problem solving in my day job), it's prone to leading you down blind alleys when the problem lies outside our normal experience and/or there are a number of unknown parameters. It can lead to some fun and entertaining theories, though, so I'm still enjoying catching up with this =)
“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
― Arthur Conan Doyle, Sherlock Holmes
I've also tried various options like reversing the bit order, or inverting them, but that doesn't seem to make much difference.
But here are the results, in case they give anyone else some ideas.
To me, this is an indication that a range of numbers is encoded, but that not the entire available range is used. For example, if I encode the letters A-Z as 5-bit numbers (values 0-31), the highest bit will be a 0 sixteen times and a 1 only ten times.
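A quick check of that A-Z example (just the arithmetic, nothing taken from the transcriptions):

Code:
values = list(range(26))                 # A = 0, B = 1, ..., Z = 25
for bit in range(4, -1, -1):             # bit 4 is the highest of the 5 bits
    ones = sum((v >> bit) & 1 for v in values)
    print(f"bit {bit}: {ones} ones, {26 - ones} zeros")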
I've been doing a bit more digging in the raw data. This time looking at adjacent n-grams. This approach has some limitations when applied to something as simple as a (potentially) binary string, but it's kept me busy during a couple of lunch hours =)
This looks at the data within each 6-7 bit clause; tests that include an unknown digit in either the initial or following n-gram are dropped and not included in the results.
Starting with the simplest analysis: how does an initial n-gram of length 1 influence a following n-gram of length 1? We see that roughly 2/3 of the time the 'opposing' digit is chosen; this is the start of a pattern that continues as we look at longer n-grams.
Code:
0 0 32.26%
0 1 67.74%
1 0 67.33%
1 1 32.67%
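For reference, this is roughly the kind of tally behind the table above. The clause strings here are invented placeholders, not real transcriptions, and unknown tones are marked '?' and skipped as described:

Code:
from collections import Counter, defaultdict

def transition_table(clauses, n=1):
    """Tally which digit follows each n-gram; skip anything touching a '?'."""
    table = defaultdict(Counter)
    for clause in clauses:
        for i in range(len(clause) - n):
            gram, nxt = clause[i:i + n], clause[i + n]
            if "?" in gram or nxt == "?":
                continue
            table[gram][nxt] += 1
    return table

clauses = ["0110100", "110101?", "0010110", "1101001"]   # invented placeholders
for gram, counts in sorted(transition_table(clauses, n=1).items()):
    total = sum(counts.values())
    for nxt, c in sorted(counts.items()):
        print(f"{gram} -> {nxt}: {c}/{total} ({c / total:.0%})")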
Moving on to an initial n-gram length of 2 and a following n-gram length of 1, we see a very clear pattern, one that was noticed very early on: there are no cases where we see the same digit three times in a row. Notice that where the digits in the initial n-gram are not the same, the distribution of the following n-gram is pretty much even.
When we look at the longer n-gram pairs we see the same thing; wherever the distribution is not even it's because it's being influenced by the 'no digit repeated three times' rule. For some of the longer n-grams the distribution is less even, most likely because of the smaller sample size for each n-gram pair.
Initial n-gram length 1, following n-gram length 2
...Notice that where the digits in the initial n-gram are not the same the distribution of the following n-gram is pretty much even.
...As a result I stand by my initial conclusion that there is no message encoded in the data in this way.
I don't like the conclusion, but that is fair evidence for it, and I'm forced to consider the possibility much more likely. (I was toying with ideas of a repeating message encoded within a non-repeating string, but I suspect there would be more deviation in the distribution than what you see.) Nice work. A deception also dovetails with the tone's volume envelope cutting off bits and corrupting the data through what (for a signal) looks more like sloppiness than anything useful or meaningful.
If so, then this was a huge missed opportunity for Elite, and too big to just leave on the table - if nothing comes of it with 1.3, I'll look into implementing this gameplay properly for the game franchise I design for.
There has been some speculation in the main relic thread that Manchester encoding may be a factor, but this has issues with the transcribed data as it stands. I have a couple of ideas related to this which I intend to look at over the weekend; we'll see what those turn up. Interestingly, I was drinking with a couple of my more security-minded friends last night and this topic came up. It led to some interesting discussion about the characteristics of encoded/enciphered information when viewed as raw data. So that was interesting...
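In case anyone else wants to poke at the Manchester idea, here's a minimal sketch of the kind of check involved. It assumes one common pairing convention (a 1 bit sent as "01", a 0 bit as "10"); the other convention simply swaps the pairs, and the example fragment is made up:

Code:
def manchester_decode(bits, offset=0):
    """Decode half-bit pairs as data bits; return None if any pair is invalid."""
    decoded = []
    for i in range(offset, len(bits) - 1, 2):
        pair = bits[i:i + 2]
        if pair == "01":
            decoded.append("1")
        elif pair == "10":
            decoded.append("0")
        else:
            return None        # "00"/"11": misaligned, or simply not Manchester
    return "".join(decoded)

sample = "0110100110"          # made-up fragment, not a real transcription
for offset in (0, 1):          # try both possible half-bit alignments
    print(offset, manchester_decode(sample, offset))

If neither alignment decodes cleanly across whole clauses, that's a point against the Manchester theory for a given transcription.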
It's also worth noting that I've been working off the great transcription work done by other people; I'm not listening to the sounds myself. These transcriptions are just one way of looking for information within one aspect of the sounds (high and low tones) and interpreting it in one manner, as a binary string or strings. It's still possible that information could be present within other aspects of the sounds as a whole, or when viewed in a different manner. What we're doing here isn't strictly science, but as in scientific investigation, looking for something and not finding it can be just as informative as finding it.
I've certainly enjoyed taking this raw data and looking for indications that information is present without trying to impose a context on it (which is what I'm at least trying to do here), and I intend to carry on doing that as long as it's still fun and interesting. This whole thing has also made me much more aware of just how good the sound work is in this game. Sitting in the middle of nowhere wearing the Rift with headphones on (I'm off exploring at the moment), just parking the ship and listening to things actually made me physically shiver at one point this week. Great stuff! =)
This is the same conclusion I've been coming to over the last few days: despite the work of the people who transcribed the recorded audio, I don't think we've been transcribing all of the information, or the right information.
I also think Xakaz may have been onto something here
Perhaps, for now, this is a better way of visually representing the audio without potentially ignoring two thirds of the data it may contain.
There is an additional (unlikely) possibility: if the data is compressed, then we would expect to see the same distribution as random, because truly random data cannot be compressed further, which means compression algorithms produce output closer and closer to a random distribution the more effective they become. However, I think a compression scheme is unlikely because it would be bad puzzle design (unless the plan is to hand us decompression-method clues at a later date), so I'm not going to investigate that avenue; I'm just assuming that it's not the case.
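As a tiny illustration of that point (using throwaway byte strings, not the actual transcriptions): already-random data barely shrinks when you try to compress it again, while structured data shrinks a lot.

Code:
import os
import zlib

patterned = b"UNKNOWN ARTEFACT " * 64     # obviously structured input
random_ish = os.urandom(len(patterned))   # stand-in for already-random data
for name, data in (("patterned", patterned), ("random", random_ish)):
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"{name}: compresses to {ratio:.0%} of original size")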
This is also the case if the information is enciphered in a way that meets the criteria for Perfect Secrecy, such as using a one-time pad. This was something I wasn't aware of until the discussion I had in the pub last night.
However, if there is information there but it's enciphered, then we're really up against it; I don't think that this is the case, though.
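For anyone unfamiliar with why a one-time pad is such bad news here, a minimal sketch (the message and key below are obviously made up): XORing with a truly random, never-reused key makes the ciphertext itself uniformly random, so no amount of frequency or n-gram analysis on the ciphertext alone will reveal structure.

Code:
import os

message = b"AN EXAMPLE MESSAGE"                    # made up for illustration
key = os.urandom(len(message))                     # the pad: random, used once
ciphertext = bytes(m ^ k for m, k in zip(message, key))
recovered = bytes(c ^ k for c, k in zip(ciphertext, key))
print(ciphertext.hex())        # statistically indistinguishable from random
print(recovered.decode("ascii"))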
Been a bit out of the loop recently, but I've been busy behind the scenes.
I've been putting together a bit of software that will model the recorded audio sequences in order to facilitate whatever kind of pattern analysis we want. It imports label exports from Audacity to model the sequences in a structured (and, most importantly, queryable) format, with start and end timestamps, along with other metadata, for all elements.
It is built in .Net, and the object model and import process are almost complete. I haven't started on a UI yet, but Linq can be used to analyse the sequences and output the results to the console. I will get it on Git as soon as I am able. Is anyone any good at .Net UI design? This is likely to be the bit that takes me longest to get in place!
Another benefit of this approach is that we can establish a set of transcriptions using Audacity projects and an agreed set of label tags. We can use these to discuss and agree message content, and I hope this can develop into an approach for establishing a standard dataset, as there seems to be quite a lot of inconsistency in the current transcriptions.
Marking up a recording in Audacity using labels is pretty straightforward, so as new recordings come in it will be easy to add them to this common dataset.
Anyone got any views on useful properties to capture for the audio elements, in addition to start and end timestamps? At the moment I have Sequences, which are formed of Segments. Segments start with a Chitter and end with a Howl, and contain a collection of Purrs, which can be any length.
Sequences are flagged with Recorder, System, Capture Time, and a description.
Howls just capture whether they are type 1 or 2.
Purrs capture an identifier for the bit value and whether they are clipped/quiet, and can also capture the frequency (if anyone can be bothered to work that out).
Chitters are just timestamps.
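For discussion purposes, here's roughly what that model could look like. This is just a Python sketch of the structure described above (the actual tool is .Net), and the label reader assumes Audacity's plain tab-separated start/end/label export format.

Code:
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Chitter:
    start: float
    end: float

@dataclass
class Purr:
    start: float
    end: float
    bit: str                       # identifier for the high/low tone value
    clipped: bool = False
    frequency: Optional[float] = None

@dataclass
class Howl:
    start: float
    end: float
    howl_type: int                 # 1 or 2

@dataclass
class Segment:
    chitter: Chitter               # a segment starts with a chitter...
    purrs: List[Purr] = field(default_factory=list)
    howl: Optional[Howl] = None    # ...and ends with a howl

@dataclass
class Sequence:
    recorder: str
    system: str
    capture_time: str
    description: str
    segments: List[Segment] = field(default_factory=list)

def read_labels(path):
    """Yield (start, end, label) tuples from an Audacity label export."""
    with open(path) as f:
        for line in f:
            start, end, label = line.rstrip("\n").split("\t")
            yield float(start), float(end), label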