If the data is temporal in nature, we can use specialized algorithms called sequence models. These models learn a probability of the form p(y | x_1, ..., x_n), where i indexes the position in the sequence, x_i is the i-th input sample, and n is the length of the sequence.
As an example, we can treat each word as a sequence of characters, each sentence as a sequence of words, and each paragraph as a sequence of sentences. The output y could then be, say, the sentiment of a sentence.
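This nested view of text can be sketched in a few lines of Python. The example paragraph and the naive period-based sentence splitting are illustrative assumptions, not part of the original text:

```python
# A paragraph as a sequence of sentences, a sentence as a sequence of
# words, and a word as a sequence of characters.
paragraph = "Sequence models are useful. They handle ordered data."

# Naive sentence splitting on periods (illustrative only).
sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
words = [sentence.split() for sentence in sentences]       # per sentence
chars = [[list(w) for w in sentence] for sentence in words]  # per word

print(sentences)     # ['Sequence models are useful', 'They handle ordered data']
print(words[0])      # ['Sequence', 'models', 'are', 'useful']
print(chars[0][0])   # ['S', 'e', 'q', 'u', 'e', 'n', 'c', 'e']
```

A real pipeline would use a proper tokenizer, but the nesting of sequences is the same idea.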
Borrowing a trick from autoencoders, we can replace y with the next item in the sequence, namely y = x_{n+1}, allowing the model to learn from the data itself without explicit labels.
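A minimal sketch of this next-item prediction idea, using character bigram counts to estimate p(x_{n+1} | x_n). The tiny corpus and the function names are hypothetical, chosen only to keep the example self-contained:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each character, how often each successor follows it."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(text, text[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, char):
    """Predict the most frequent successor of `char` in the training text."""
    return counts[char].most_common(1)[0][0]

# The "label" for each character is simply the character that follows it,
# so no human annotation is needed.
corpus = "hello world, hello there"
model = train_bigram(corpus)
print(predict_next(model, "h"))  # 'e' — every 'h' in the corpus is followed by 'e'
```

Modern sequence models replace the raw counts with a learned neural network, but the self-supervised setup is the same: the sequence provides its own targets.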